Mary's Blog


How to be search stupid: Working hard to hardly rank in Google with orphan and dead end pages

The first rule of understanding Google is that Google is not psychic.  It doesn't know where your website or new pages are unless it can find them by following a link from a site already in its index to your new page.  (Yes, you can ping and submit a sitemap, but direct access from Google is not the same thing as a crawl!)  So, once you get someone else to link to you and Google has found your site, you need Google to crawl, index and begin to trust your site.  That means YOU need to link both to other pages on your own website and out to pages on other websites.

Think of Google as the ultimate game of 6 Degrees of Kevin Bacon.  Somehow, your website should be connected to other important websites (i.e. websites that Google trusts) in order for Google to discover and begin to trust your site!  One of the most critical things you should learn about SEO is how to think like a search engine.  Most people never consider HOW Google crawls the web, and because they never consider HOW, they never assist it in doing so.  And that is one key to becoming a FOG (Friend of Google).

How a search engine crawls the web

Basically, a search engine crawls the web to discover new pages by starting with the websites it already knows and trusts and following links on these trusted pages to other pages.  Search engines begin their crawl of the web with a list of completely trusted sites (i.e. sites that would never link to spam - we call these "seed sites").  These seed sites are the MOST trusted sites in the search engine's index and are where it begins its crawl of the internet (essentially, seed sites are the Kevin Bacon of search - we all are somehow connected to them!).

[Image: How Google crawls the web]

Google crawls the Trusted Seed Sites first, then starts crawling all the sites these seed sites link to

Trusted Sites: There are a few very trusted domains out there.

Example: DOJ.gov, LLI.org

When these sites link out to other sites they link to:

  • 100% quality sites
  • 0% spam sites

This second level of sites is called -1 Trust Distance sites. (1 link away from a Trusted Seed site)

After the engine crawls these sites, it then goes out and crawls all the sites they link to

Example: Wall Street Journal, Harvard.edu

When these sites link out to other sites they link to:

  • 95% quality sites
  • .1% spam sites

This third level of sites is called -2 Trust Distance sites.  (2 links away from a Trusted Seed site)

After the engine crawls these sites, it then goes out and crawls all the sites they link to, and so on and so on.

When these sites link out to other sites they link to:

  • 70% quality sites
  • 10% spam sites

As you extrapolate that out to -3 and -4 Trust Distance, you will see more and more spam sites being linked to, and thus sites at each further level lose credibility and Trust.
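To make the idea concrete, here is a minimal sketch of how Trust Distance could be computed as a breadth-first crawl outward from the seed sites.  This is only an illustration, not Google's actual code: the site names and the link graph are made up, and the distance is counted as a plain number of links (1, 2, 3) rather than written as -1, -2, -3.

```python
from collections import deque


def trust_distances(links, seeds):
    """Return {site: number of links from the nearest seed site}."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        for target in links.get(site, []):
            if target not in distance:  # first visit is the shortest path
                distance[target] = distance[site] + 1
                queue.append(target)
    return distance


# Hypothetical link graph: each site maps to the sites it links out to.
links = {
    "seed.example": ["news.example", "university.example"],
    "news.example": ["agentblog.example"],
    "university.example": ["agentblog.example"],
    "agentblog.example": ["newpost.example"],
    "island.example": ["newpost.example"],
}

distances = trust_distances(links, ["seed.example"])
for site, hops in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{site}: {hops} link(s) from a seed site")
```

Notice that island.example never shows up in the result.  Nothing the crawler already knows links to it, so in this toy model it is never discovered at all.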

Resource: Trust Rank Algorithm

Read also: Use Trust Distance to avoid the Sandbox, Boost Rankings and Basically Spank the SERPs with your real estate blog

Why do you care about the inner workings of a search engine? 

Because when you write for your site, you need to consider how Google is going to keep moving through your site.  One of your jobs as a webmaster is to ASSIST a search engine to not just crawl you, but to crawl the rest of the web.  You want to make your pages comfortable and reliable for a search engine, giving them plenty of places to move on to other pages on your site and other pages off your site.  How do you do this?

  • 1. Avoid Creating Orphaned Pages

An orphan page is a page that is not linked to by any other page on your site or on the internet (i.e. that cannot be reached from anywhere on the site or any other page on the internet) and thus cannot be found by a search bot unless it is linked to externally.  Basically, this is a page that nobody can find, not even Google!  Why?  Because GOOGLE ISN'T PSYCHIC.

How to get your orphaned pages adopted by the Angelina Jolie of Search (Google)

EASY SOLUTION:  Immediately after you write your post, go over to your ActiveRain (AR) blog and link back to your new post in an old article, or use a directory or RSS submission tool to build at least one backlink to your new post.
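If you can list your pages and the links between them, spotting orphans is simple.  Here is a rough sketch, assuming you already have that list (the page paths and the links dictionary below are hypothetical, and it only looks at links within your own site, not backlinks from elsewhere):

```python
def find_orphans(all_pages, links):
    """Return the pages that no other page links to."""
    linked_to = set()
    for source, targets in links.items():
        # A page linking to itself does not rescue it from being an orphan.
        linked_to.update(t for t in targets if t != source)
    return sorted(set(all_pages) - linked_to)


# Hypothetical site: each page maps to the pages it links to.
all_pages = ["/", "/about", "/blog/old-post", "/blog/new-post"]
links = {
    "/": ["/about", "/blog/old-post"],
    "/about": ["/"],
    "/blog/old-post": ["/"],
    "/blog/new-post": ["/", "/about"],
}

orphans_before = find_orphans(all_pages, links)
print(orphans_before)  # ['/blog/new-post']

# The easy solution: link to the new post from an old article.
links["/blog/old-post"].append("/blog/new-post")
orphans_after = find_orphans(all_pages, links)
print(orphans_after)  # []
```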

  • 2. Avoid Creating Dead Pages

A dead-end page (or dead page) is one that has no outgoing links, thus creating a "dead end" for a search engine.  Dead pages are unnatural on the web (a web page should be connected to other pages).  Most importantly, a dead page leaves both the robot and the visitor no choice but to abandon the site, since they have no natural way to move on from the page.

Read also: SEO terminology: dead pages

How to breathe new life into dead pages

Just make sure you place at least one link out to another page on the web and one link to another page within your own site.  I like to give readers textual or visual cues as to whether I am keeping them on my site by saying "read also" or sending them off the site by saying "resource."
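You can check a post for this before you publish.  The sketch below uses only Python's standard library to count the links in a post body and sort them into links to your own site and links out to other sites.  The blog address and the post HTML are made up for the example, and it assumes you feed it just the body of the post, not the sidebar or menu.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def count_links(body_html, page_url):
    """Return (internal, external) link counts for a post body."""
    collector = LinkCollector()
    collector.feed(body_html)
    own_host = urlparse(page_url).netloc
    internal = external = 0
    for href in collector.hrefs:
        parsed = urlparse(urljoin(page_url, href))
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript: and similar
        if parsed.netloc == own_host:
            internal += 1
        else:
            external += 1
    return internal, external


# Hypothetical post body and blog address.
post_body = """
<p>FHA loans are insured by the government.
Resource: <a href="http://www.hud.gov/">hud.gov</a></p>
<p>Read also: <a href="/blog/successful-post">Formula for a Successful Blog Post</a></p>
"""
internal, external = count_links(post_body, "http://myblog.example/blog/fha-loans")
is_dead_end = internal == 0 or external == 0
print(internal, external, is_dead_end)
```

A post passes only if it has at least one link of each kind, which is the rule of thumb described above.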

Read also: Formula for a Successful Blog Post

NOTE: Do not rely on sidebar or top menu bar navigation for links; Google is not always fond of navigation/site-wide links.  You want to make sure that in the content of every post you include links out to other sites and links to other pages on your own site.  Try to link to trusted sites (meaning Government, Educational or high PageRank sites on the internet).  DO NOT CALL ME UP AND TELL ME "I DO LINK OUT TO OTHER SITES - I LINK TO MY WEBSITE FROM MY BLOG IN EVERY ARTICLE."  You need to link out to various, related resource sites that have authority with Google - it likes to see you trust other sites it trusts!  (For example, if you write a post on FHA loans, link out to hud.gov as a resource for readers and engines.)

TIP: Never rely on navigation to help Google "discover" your pages, always use links in the body of your posts and content. 

38 comments • Mary McKnight • August 25 2008 04:44AM

The case of the idiot web developer: how your designer can get you banned from Google in less than 3 weeks.

Back to the conversation about hidden text.  Recently I have seen 3 sites banned (completely de-indexed) from Google because a web developer thought they could artificially boost the SEO of a site by injecting hidden text into the template.  This is what I call, Case of the Idiot Web Developer.

What is hidden text? 

Hidden text is text you place on your website that is typically keyword rich, invisible to a user but evident to a search engine.  Hmmm... sounds a lot like spam, huh?  Well, that is exactly how Google sees it! SPAM!  And what happens to spammers in Google?  They get banned - i.e. de-indexed. 

Read also: Is Your Real Estate Blog Keyword Spamming Behind Your Back?

What are the most common ways you or your web developer can place hidden text in your website or blog?

  • 1. Make the text the same color as the background
  • 2. Set the CSS for the text to display:none
  • 3. Make the text teeny tiny, like a 1-3pt font

How do you check if your site is spamming and/or has hidden text?

Use This Tool to Check for Keyword Spamming:  spam detector tool

***FYI: CSS rules with display set to none are very common today as more and more sites are CSS driven.  This is not always indicative of spamming and should be ignored, as the tool's own disclaimer notes.  However, if the hidden element contains text - especially keyword rich text - IT IS SPAM!
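If you want to peek at a page yourself, here is a rough sketch of the same idea using Python's standard library.  It is not the tool linked above and it is far from complete: it only reads inline style attributes, so it will miss anything hidden through a separate stylesheet, and it cannot tell when text is the same color as the background.  The sample page is made up.

```python
import re
from html.parser import HTMLParser

# Tags that never have a closing tag, so they are not tracked as "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)
TINY_FONT = re.compile(r"font-size\s*:\s*[0-3]\s*(pt|px)", re.I)


class HiddenTextFinder(HTMLParser):
    """Collect text that sits inside an element hidden by its inline style."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one (tag, is_hidden) entry per open element
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = bool(self.stack) and self.stack[-1][1]
        hidden = parent_hidden or bool(
            HIDDEN_STYLE.search(style) or TINY_FONT.search(style)
        )
        self.stack.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack and self.stack[-1][1]:
            self.hidden_text.append(text)


# Hypothetical page: one hidden keyword list, one harmless hidden container.
page = """
<h2 style="display:none">Pasadena, San Marino, Arcadia, Altadena</h2>
<div style="display: none"><img src="spinner.gif"></div>
<p>Welcome to my real estate blog.</p>
"""
finder = HiddenTextFinder()
finder.feed(page)
print(finder.hidden_text)
```

The hidden div with only an image in it is not reported, because it contains no text.  The hidden heading stuffed with city names is.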

Example of a site spamming with hidden text and keywords

*Please note, this site has since been fixed!

What does Google do to sites with hidden text in them?

Hidden text is considered a spamdexing tactic, and Google has long banished sites with hidden text from its index.  Back in the 90s it was common to inject hidden text into sites and see huge gains in search engines, as this text could artificially boost the keyword density of a page and therefore make it seem extremely relevant to a user's search criteria.  But this tactic has been a no-no for years.  In fact, it has been

Case study of how an idiot web developer got one blog banned

The sad story of a Pasadena real estate blog:

In late July, Irina Netchaev received an email from Google about her blog.  Imagine her shock when she found that Google was going to de-index her for spamming with hidden text.  Well, Irina called up her web development company and asked what this was all about.  The receptionist initially told her that Google didn't send the email.

[Image: Pasadena real estate blog]

This is the exact email Irina received from Google:

Dear site owner or webmaster of pasadenacarealestatehomes.com,

While we were indexing your webpages, we detected that some of your pages were using techniques that were outside our quality guidelines, which can be found here: http://www.google.com/webmasters/guidelines.html

In order to preserve the quality of our search engine, we have temporarily removed some webpages from our search results. Currently pages from pasadenacarealestatehomes.com are scheduled to be removed for at least 30 days.

Specifically, we detected the following practices on your webpages:

* The following hidden text on pasadenacarealestatehomes.com:

e.g. Pasadena, San Marino, Monterey Hills, San Gabriel, South Pasadena, Monterey
Hills, Arcadia, Alhambra, Altadena, Sierra Madre, Highland Park, Temple
City, Duarte, La Canada

We would prefer to have your pages in Google's index. If you wish to be reconsidered, please correct or remove all pages that are outside our quality guidelines. When you are ready, please visit:

https://www.google.com/webmasters/tools/reinclusion?hl=en

to learn more and request a reconsideration request.

Sincerely,
Google Search Quality Team

Sounds pretty official to me, what about you?  I'm thinking I must be a real detective because the Google.com email address is what tipped me off to this email being the real deal (the subsequent deindexing was the clincher, though).  When Irina called her web development company, Develement, LLC, the receptionist asserted that there wasn't spam in her site.  So, Irina called me.  A simple look inside the code showed obvious spam included with the following tag <H2><SITE:TAGLINE /></h2> which translated to a list of keywords being injected in an H2 tag that was set in the CSS to display:none.  I took her index html file and made the change, which was simply either taking out the tag altogether or setting the CSS to display the text visibly.  I sent the file to Irina and she sent it to her development company - this was their response:

Hi Mary,

Just had a call from Tom Balletta who advised me that he will NOT upload the file that I sent to Nicole.  He said that it was a proprietary file worked by "someone" outside of his office and it can maliciously damage my site.  I pressed him on it and he still refused.

He claims that his programmers made the necessary fixes to my site and that he doesn't know why Google flagged it, but there were as he calls it "inappropriate" key words that were not relating the content.  He refused to tell me which keywords he was referring to and was quite arrogant - quite a character!

Not sure if you had a chance to take a look at the way the site looks now and if it's okay to resubmit. 

Let me know your thoughts.

Thank you,

Irina Netchaev

Now, of course, all these "programmers" (I take offense to people who only know graphics, HTML, CSS, Flash and just enough Javascript to be dangerous calling themselves "programmers" as any idiot with a computer can make a webpage, but that is another post for another day) did was change her title tag but never set the CSS to visible or removed the tag.  Not exactly a fix in Google's eyes.  Nevertheless, Irina's site was banned from Google.

This is what a site query of a banned website looks like:

[Image: Pasadena Real Estate Blog with 0 indexed pages]

Did Irina's web company choose to help her after the ban?  NO.  They let her twist in the wind (continuing to charge her, of course, for their stellar service and expertise!) while she watched her leads dwindle to 0.  So, Irina came to me to get her site fixed.

How do you fix a banned site?

  • 1. You fix the offending code
  • 2. You join Google Webmaster Tools
  • 3. You request reinclusion
  • 4. Then you wait however long it takes for a human being at Google to review your site and re-include you. This can take several weeks.
  • 5. At the same time you request reinclusion on the banned domain - you bring up a sister site alongside the banned domain that will be indexed. In Irina's case, we brought up another Pasadena site alongside the banned one. This site looks exactly the same as the old site with all the same content sans the offending code. You might recall when I used this same tactic for Cyndee Haydon when one of her sites experienced a Google penalty back in December. This is the key to getting your leads back in short order while you wait for Google to re-index the banned site.

To help Irina, link to her sister blog at www.pasadenaviews.com

Things to do while awaiting reinclusion:

  • 1. Add new content to the new clean domain
  • 2. Build 10-100 backlinks to the new domain to let Google know it is trusted (start with ActiveRain and Localism and your various social profiles, make sure you post all new content to Twitter)
  • 3. Force a site-wide ping of the new domain to put you back in Google's good graces.

Things not to do while awaiting reinclusion:

  • 1. Do not redirect the banned domain to the new clean domain while awaiting reinclusion
  • 2. Do not add new content to the banned domain - it is a waste of time

So, while this post may have been harsh, it shows exactly how an idiot web developer can cause your site to get banned through no fault of yours and why you need to stay on top of your site structure and coding.  If you have a Develement blog, you need to start checking for spam tags; we have found that they have been injecting spam into their sites for at least the past 2 months.  My recommendation to these "programmers" - don't play with SEO if you don't know what you are doing - leave that to the big kids - go back to your box of Crayolas and draw!  While you are playing in the sandbox with your fingerpaints, why don't you look up how to read obfuscated JavaScript, I hear you also can't do that!  FYI - real "programmers" can.  Want a lesson?  I'd be happy to put together a remedial class.

 Yes, Devidiots, I mean Develement... I'm pissed that you caused good bloggers this kind of heartache, took their money and refused to help when you had the chance.  You can't even clean up your own mess.  I think the industry term for that is irresponsible parasite.


48 comments • Mary McKnight • August 15 2008 11:39AM