
Data-centric pages suffer with article-centric Google

Data-centric pages suffer with article-centric Google RalphSlate 7/9/12 10:14 AM
For the past few months I've struggled with the effects of Penguin, and its possible interplay with Panda. I have come to realize that I may be struggling because I live in a data-based world but Google lives in an article-based world.

From what I've been reading, Google wants pages that have over 400 words. To me, a data guy, this is like a foreign language. My view of the world is to present data in as compact a format as possible. I tend to think in terms of columns and rows instead of words and sentences. Because of this, the design of my entire website is from a data perspective.

I have learned from people here who have viewed my site that they are living in an article-based world. Not being specifically interested in my data, they view web pages visually, based on data volume. An SEO blog critiqued a page on my site, comparing it to ESPN's player page, and stated that my page was no different. Had he actually looked at the data, he might have understood that my site had the player's complete career history whereas ESPN had only the current season's history. Since this person had no knowledge of hockey, he had no frame of reference to judge the data, so he defaulted back to "both pages have visually similar amounts of words on them so they must be the same".

I think I am starting to understand that Google views the web the same way, because it is an algorithm and algorithms can't understand context that well.

Here's an example of the text from one of my pages:

http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=40170

===============
Ken Morden
Right Wing
Born Jun 2 1956 -- Oshawa, ONT
Height 5.11 -- Weight 160

                                Regular Season              Playoffs
Season   Team             Lge   GP   G   A  Pts  PIM  +/-   GP   G   A  Pts  PIM
1973-74  Oshawa Generals  OHA   57  14   7   21   29
1974-75  Oshawa Generals  OHA   50  15  11   26    9
1975-76  Oshawa Generals  OHA   38   8  17   25   19

Embed Ken Morden stats! | View as text
================

That's it - 68 words because this player had a short career, but his career is accurate and complete.

This page is returned on page #3 of the SERPs, behind a lot of irrelevant pages, and is likely pushed down so far due to Panda effects. The #1 result (a site that took data from me) is likely getting its spot by keyword-stuffing the name "Ken Morden", since it has text such as "Add Ken Morden to my Love List", "Add Ken Morden to my Hate List", "Put Ken Morden in a Weight Class", "Add Ken Morden to Team Roster (2012-2013)", etc.

So my question is, how do I adapt to Google's article-centric view of the world? I know that Google says "don't design your pages to be read by search engines", but when Google won't return the page in the SERPs because it doesn't conform to Google's article-centric view of the world, I have to do something.

Keep in mind that I am not Wikipedia, and I can't write a 400 word article on each of 150,000 players on my site. That is not my site's focus either. People come to my site for the comprehensive data, and they like how it is presented. Wikipedia actually cheats in this area by translating data into words, often data they get from my site. They might have a page that says:

================
Ken Morden was a hockey player who played from 1973 to 1976. He was born on June 2, 1956 in Oshawa Ontario. Morden was a Right Wing who checked in at 5 feet 11 inches and weighed 160 pounds. He shot right-handed.

Morden spent his entire career playing in the Ontario Hockey Association. In 1973-74, his first season, he played in 57 games, scoring 14 goals and 7 assists for 21 points, with 29 penalty minutes. In 1974-75, he played in 50 games, scoring 15 goals and 11 assists for 26 points, with 9 penalty minutes. In 1975-76, his last season, he played in 38 games, scoring 8 goals and 17 assists, with 19 penalty minutes. Over his career, he played in 145 games, scoring 37 goals, 35 assists, for 72 points, with 57 minutes of penalties.

He did not appear in the playoffs during his three seasons in the OHA.
================

That's 151 words. My visitors don't want that. They want the table with the data.
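
To make the table-versus-prose contrast concrete, here is a minimal Python sketch that renders the three season rows quoted above as article-style sentences and compares crude word counts. The function names and the sentence template are hypothetical, chosen only for illustration, and counting whitespace-separated tokens is just a rough stand-in for "words"; nothing here is a claim about how any search engine measures a page.

```python
# Hypothetical sketch: render a stats table as prose and compare word counts.
# The rows are the Ken Morden seasons quoted above; all names are illustrative.

SEASONS = [
    # (season, team, league, GP, G, A, Pts, PIM)
    ("1973-74", "Oshawa Generals", "OHA", 57, 14, 7, 21, 29),
    ("1974-75", "Oshawa Generals", "OHA", 50, 15, 11, 26, 9),
    ("1975-76", "Oshawa Generals", "OHA", 38, 8, 17, 25, 19),
]


def season_sentence(row):
    """Turn one table row into an article-style sentence."""
    season, team, league, gp, g, a, pts, pim = row
    return (
        f"In {season}, he played in {gp} games for the {team} of the {league}, "
        f"scoring {g} goals and {a} assists for {pts} points, "
        f"with {pim} penalty minutes."
    )


def career_totals(rows):
    """Sum the numeric columns (GP, G, A, Pts, PIM) across all seasons."""
    return tuple(sum(row[i] for row in rows) for i in range(3, 8))


def word_count(text):
    """Count whitespace-separated tokens, a crude stand-in for 'words'."""
    return len(text.split())


# The same data, once as compact rows and once as sentences.
table_text = "\n".join(" ".join(str(cell) for cell in row) for row in SEASONS)
prose_text = " ".join(season_sentence(row) for row in SEASONS)

print(career_totals(SEASONS))  # (145, 37, 35, 72, 57)
print(word_count(table_text), word_count(prose_text))
```

The career totals match the figures in the prose version above. The prose rendering carries exactly the same facts as the rows but comes out at roughly three times as many tokens.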

So how should I adapt? Should I add more marginally-related text to my page to compensate for the lack of words within the data? For example, I could add a whole section to each player page of "Related Data". I could have a blurb that says "Here is a list of players with names that are similar to Ken Morden, perhaps you were looking for a different player." I could have a blurb that says "here are links (internal) to teams that Ken Morden played on. This will allow you to get more information about those teams". I could have a list of players that have the same birthdate as Ken Morden. I could have a list of links (internal) to the leagues that he played in. I could have a list of other players from Oshawa Ontario.

In other words, I could add marginally-related content to the page so that it is not penalized by Google for having a low amount of "words".

What do people think? I know that it is something I would do in a world without Google, but since Google is now penalizing me otherwise by burying my pages (this primarily isn't about competing with other sites, it is about being buried), what other choices do I have to bridge the gap between a data-centric site and the article-centric Google?



Re: Data-centric pages suffer with article-centric Google StevieD_Web 7/9/12 11:39 AM
>From what I've been reading, Google wants pages that have over 400 words 

Ralph, did you read that comment on a Google webmaster guideline?

Nope?  I thought so.  Google doesn't care about word counts, Google cares about useful content meeting the needs of their customers (searchers).  Personally I have pages with fewer than 100 words, I have other pages that far exceed 400 words by an order of magnitude.  Both pages are indexed and both pages respond to appropriate queries.
Free2Write 7/9/12 11:52 AM <This message has been deleted.>
Re: Data-centric pages suffer with article-centric Google Suzanneh 7/9/12 12:05 PM
I have to agree with StevieD.  Some of my top pages have way less than 400 words yet they rank well because they provide what the user is looking for.  My site is 10 years old, and I only started adding in articles in 2007.  Out of my top 10 landing pages, only 5 are "articles".

Suzanne
Re: Data-centric pages suffer with article-centric Google RalphSlate 7/9/12 12:25 PM
In absence of word-counting in the algorithm, can you offer a suggestion as to why the page I linked to does not return until page 3 for the terms [Ken Morden hockey] appearing only after clearly irrelevant results?  Notice that even using that very precise term, Google is serving results that do not even have "Ken" and "Morden" in close proximity to each other - on page 2 a result is returned for a hockey team in Morden, California whose coach is named Ken Wiebe.

Can you offer a suggestion as to why I am seeing a pattern where the pages on my site for the players who had brief careers are being buried in the search results, usually below similarly irrelevant content? This seems to be the traffic that I no longer get from Google - I am still being ranked for players that Google understands are famous hockey players - I'm losing out on the lesser known players who have short careers. This isn't a case of me being beaten by competitors - it is a case of the pages being buried.

Can you offer a suggestion as to why a search for [Mike Vitale hockey] returns a page on USCHO.com as the #2 result but does not return my page at all, even though we offer similar information on that player? Can you find any structural differences between their page and mine that might explain this? The primary difference I noticed is that the USCHO page has a lot more "words" on it - in the form of semi-related links which add about 200 more words to the page. 

Check out the pages, side by side:

http://www.uscho.com/stats/player/mid,15442/mike-vitale/

http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=146971

Isn't your first reaction that hockeydb is "thinner" than USCHO, even though the directly relevant content on the two pages is similar?

Can you offer me a metric that Google uses to determine "useful content meeting the needs of their customers"?

Can you help me understand why most people offer advice such as "your page is too thin" when I offer up a page like this for critique - even though the page offers a comprehensive data profile on this player? If people aren't looking at the volume of words (ignoring the volume of data), then what are they looking at when they deem it "thin"?

Thanks,

Ralph


Free2Write 7/9/12 1:18 PM <This message has been deleted.>
Re: Data-centric pages suffer with article-centric Google StevieD_Web 7/9/12 1:32 PM
I think you are grasping at straws but I took a look at the pages.

Both sites have excessive ad blocks and ads above the fold.  That said, your top navigation block is more negatively affected by the ad at the top of the page than the other site (harder to find your nav line versus the ad.... which could be a user experience issue).

The content (data) presented on the other site is far more extensive than your own, as the site includes game stats (and comments) where yours doesn't.


I don't believe it is that your site is thinner, I just think the other site has done a far better job with the data than you.  In fact the other site even includes a team logo in the area reserved for the data while you have ANOTHER ad block in the same slot.


By the way.....  baaeba60=e2cdfdb1&pid=146971 is nofollowed from your site, but others could link to the page... thus allowing Google access to the page.... and it duplicates your content.






Re: Data-centric pages suffer with article-centric Google RalphSlate 7/9/12 1:41 PM
I am focusing on words and keywords because when I look at my pages and compare those that rank well versus those that don't, and when I compare those that don't rank well to competitor pages with similar relevant data, the number of words (and also keywords, but I don't want to blatantly stuff the way other sites have) on the page seems to be the differentiating factor.

My premise is that Google doesn't understand the context of data as well as a human, so if someone searches for [Joe Milo hockey], and it finds both my page and the player page at DropYourGloves.com, it ranks the latter as #1 because they have "Joe Milo" stuffed 7 times on their site whereas the name appears just twice on my page.

Also, compare to USCHO.com; it ranks that site as the #5 result. My premise is that the 200+ words of additional information on the right of the page lifts that site up.

If you look at all three pages, my site has the most complete data on this hockey player. DropYourGloves.com has 3 lines of stats, USCHO.com has four lines of stats, and hockeydb.com has nine lines of stats.

Algorithms are not people, so they have to use metrics to simulate "experience". They can't comprehend the data, they can just rank it.
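
A rough way to test the premise above on a handful of pages is to count how often the player's name appears in each page's visible text. The Python sketch below is only that: the helper names are hypothetical, the "stuffed" sample reuses the phrases quoted earlier in the thread, and the "plain" sample is the opening of a player profile. It measures name repetition and says nothing about how Google actually ranks pages.

```python
import re


def name_mentions(page_text, name):
    """Count case-insensitive occurrences of a full name in page text."""
    parts = [re.escape(part) for part in name.split()]
    # Allow any run of whitespace between the parts of the name.
    pattern = re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)
    return len(pattern.findall(page_text))


def mention_density(page_text, name):
    """Name mentions divided by the number of whitespace-separated tokens."""
    total = len(page_text.split())
    return name_mentions(page_text, name) / total if total else 0.0


# Illustrative samples only, not scraped from any live page.
stuffed_sample = (
    "Add Ken Morden to my Love List. "
    "Add Ken Morden to my Hate List. "
    "Put Ken Morden in a Weight Class."
)
plain_sample = "Ken Morden Right Wing Born Jun 2 1956 -- Oshawa, ONT"

for label, text in (("stuffed", stuffed_sample), ("plain", plain_sample)):
    print(label, name_mentions(text, "Ken Morden"),
          round(mention_density(text, "Ken Morden"), 3))
```

Running the same two functions over the saved text of each competing page would show whether the repetition pattern described here holds beyond a few hand-picked examples.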
Re: Data-centric pages suffer with article-centric Google RalphSlate 7/9/12 1:52 PM
Stevie_D, you're illustrating my point for me. As someone who likely isn't in tune with hockey, you can only look at the page using your perspective. You're looking at the data from a volume perspective, not from a hockey perspective. The "Game by game" is extraneous information. It's not useful to 99% of the population who look for information on a player. You don't know that though because you're not living the hockey world.

You also missed that I have a more complete profile of this player because I offer his full career while USCHO only focuses on his college career. A better example might be a player named Joe Milo. Compare the two sites:

http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=73803

http://www.uscho.com/stats/player/mid,12817/joe-milo/

Now one obvious difference is that USCHO has search-friendly URLs (whatever that's worth...). But beyond that, USCHO has four lines of this player's career. I have all nine.

That is the point of my premise - that Google is not living the hockey culture, so they don't know which page is "more useful" or "more complete". Only search users know that. But Google is burying my page - below junk, so people don't know about it. It's not a competition if my page isn't returned. My quest is to know why that is the case. If I can solve that problem, I have no problem competing on merit.
Free2Write 7/9/12 2:09 PM <This message has been deleted.>
Re: Data-centric pages suffer with article-centric Google RalphSlate 7/10/12 7:29 AM
Here is an example of what I am thinking of doing.

Old page:

http://www.hockeydb.com/ihdb/stats/pdisplay.php?pid=138827

New page:

http://www.hockeydbdev.com/ihdb/stats/pdisplay.php?pid=138827

Doesn't the second page appear more useful and less thin than the first?

This is smart for another reason too - as I expand my database into more and more leagues, the number of players with the same name is growing. Google is not going to return the 13 pages on my site devoted to 13 different players named Steve Smith. If I'm lucky they will return the top one, or if someone puts in a more precise search, the correct one. But people aren't always going to get the one they want. When people search directly on my site, they are presented with a list of players; when they come in from another path, they will now have a similar view.
Free2Write 7/10/12 7:51 AM <This message has been deleted.>
Re: Data-centric pages suffer with article-centric Google RalphSlate 7/10/12 8:28 AM
I think you may be oversensitive to advertising. The Facebook block is not "flashing, flashing, rotating, flashing, flashing". It's a muted animated GIF with 3 frames, with a frame time of 3-3-5 seconds, and only about 10% of the image changes on each of the frames, to allow more words to be shown. No flashing, no rotating.

Also, the page in question gets about 120k views per day. I think that warrants the obtrusion.
Re: Data-centric pages suffer with article-centric Google Free2Write 7/10/12 8:31 AM
Why ask if you've already made up your mind?

Re: Data-centric pages suffer with article-centric Google RalphSlate 7/10/12 8:43 AM
I didn't ask about the advertising. Advertising is like taxes - if you ask people how much, they will always say "less".
Re: Data-centric pages suffer with article-centric Google Suzanneh 7/10/12 8:57 AM
>>I know that it is something I would do in a world without Google

This ^^ says it all.  Personally, I like the second page and it seems to make sense to me.  I think I got the impression from your initial post that you thought this was a bad idea -- or that you had to write a lot of content (e.g. have more than 400 words).

As an aside:  Is there a reason why your home page is centered and the other pages I visit are to the left of the screen?  It threw me off, visually.

The ad doesn't bother me at all.  Used to it and understand why it's there. :-)

Suzanne
Re: Data-centric pages suffer with article-centric Google RalphSlate 7/10/12 9:13 AM
Suzanneh, I mis-typed. I meant to say that I would not do it in a world without Google. I feel that people, especially hockey fans, can correctly assess the context of the data on my site. The information is more or less redundant - yes, it is culled out and grouped a bit more, but to be honest, the main reason I would think about doing it is to give Search Engines more content to digest about the particular page, since the pure data format seems to be resulting in a penalty against my page.

It is similar to someone with an image gallery describing the image in great detail using text. "Here is a photograph of a golden retriever sitting on a lawn, with a ball next to him. The season is the fall, and the sky is blue". Someone not visually impaired arriving at the page already knows that because they can see the photo. The text would be for search engines.

Thanks for the feedback everyone. What do people think about the use of the player's name in each of the "Related info" sections? Keyword stuffing?
Re: Data-centric pages suffer with article-centric Google Lysis 7/10/12 9:14 AM
Regarding word count, Matt Cutts mentions in his "Is article posting against guidelines" video that articles are just there for backlinks, but an interesting comment from him was "the articles are always written to word count (paraphrasing)", so I would be careful adhering to a specific word count. That could be detected by algorithms. If every page on your site is 400 or 500 words, you'd think someone has something better to say that is more succinct in 200 words. :)
Re: Data-centric pages suffer with article-centric Google KORPG Kevin 7/10/12 9:18 AM
I'd read this article as I think it applies almost 1:1 to your situation.

http://www.stonetemple.com/matt-cutts-and-eric-talk-about-what-makes-a-quality-site/
Re: Data-centric pages suffer with article-centric Google StevieD_Web 7/10/12 11:02 AM
>As someone who likely isn't in tune with hockey, you can only look at the page using your perspective. You're looking at the data from a volume perspective, not from a hockey perspective. The "Game by game" is extraneous information. It's not useful to 99% of the population who look for information on a player. You don't know that though because you're not living the hockey world.  

Maybe my background as a baseball & football fan and wanting stats, has distorted my perspective of your hockey site.... but I don't think so.... sports is sports and the passion I bring to my sports should translate to your sport as well.  While I may not be a hockey fan per se, I can honestly say that one of the finest sporting events I have observed was this little masterpiece:


and to dismiss my opinions because I am not a true fan of the sport is inappropriate.


Re: Data-centric pages suffer with article-centric Google B.Z. 7/10/12 11:42 AM
Lysis, do you happen to have a link to this video?
Re: Data-centric pages suffer with article-centric Google B.Z. 7/10/12 11:46 AM
I think you're adding value by having the disambiguation and league links on the right. But I'd consider either unlinking the team names in the stats on the left side or ditching the team information section on the right, since as it is now there are two places to go to get more team information.
Re: Data-centric pages suffer with article-centric Google Brian Ussery 7/10/12 12:12 PM
RalphSlate,

You can use lots of words in a page and say very little.  It is not so much the word count as it is what those words say and how much utility a page provides to users.  For example, USCHO provides more utility to users because it provides game by game stats for Ken Morden's hockey career.  It is possible that someone searching for [ken morden hockey] might want to know this player's previous hockey team and hometown, no?  USCHO provides that whereas I don't see it at DB.  Also, when comparing these two pages with http://browsersize.googlelabs.com/ DB has more ad content above the fold as seen by 90% of users. Google uses a layout algorithm and could compare results to determine a winner.

-Brian



Re: Data-centric pages suffer with article-centric Google RalphSlate 7/10/12 8:11 PM
Brian --

USCHO provides game by game stats only for the college experience of recent college hockey players. They do not have any stats - summary or game-by-game - for any of a player's experience outside of his four college years. If someone is interested in a player's statistical history, and the choice is between a profile of just that player's four college years, with a tabulation of how many points he scored in each game, and the player's complete history, kept up to date while in-season, the choice should be obvious.

USCHO merely mentions the player's previous team. I not only have the player's stats for that team, I have the stats for the four teams prior to that. I also have the player's hometown, offering it in a more compact format after his birthdate. 

USCHO is a great site for college hockey news (I went to college with one of the founders). They are not a go-to site for statistical profiles. Google should let users decide instead of burying my site for some unspecified reason. I am confident that between my site and USCHO, or "dropyourgloves.com" - which has God-awful layout and similarly incomplete information - people will choose my site. I just want Google to stop burying my pages.