How to Make Web Crawlers Happy

A hospital website is by its very nature a large and ever-changing organism. Even worse, there are two dramatically different audiences your website must engage: people and web crawlers. If your website design neglects web crawlers, search engines will take their revenge and no one will be able to find your content. Fortunately, there are steps we take, and you can too, to keep those crawlers happy.

It turns out search engine web crawlers are lazy. They spend a lot of time crawling through complex sites, some of which don’t provide a map, trying to understand where the canonical content is and whether content is secure. Here at Geonetric, we do everything we can to make it easy for them so your content is indexed more frequently and accurately. Here’s how.

Define the canonical URLs
It’s possible to have the same content in multiple places on a site. This happens frequently with provider information. For example, Dr. Smith has multiple profiles because she serves multiple communities. Also, the URL to her profile likely includes her name (called a slug) instead of a random ID number, so visitors can find it easily. These approaches are helpful to site visitors but confusing to lazy web crawlers, which prefer a single page and URL for Dr. Smith. To make the web crawler’s job easier on Geonetric client sites, we define a canonical, or preferred, page for indexing, which typically uses the slug as the URL.
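
Search engines read the canonical page from a `<link rel="canonical">` element in the page head. As a rough illustration, here is a minimal Python sketch that builds that element. The URLs and the `canonical_link_tag` helper are made up for this example; this is not how VitalSite generates the tag.

```python
from html import escape

# Hypothetical URLs that all show the same Dr. Smith profile.
DUPLICATE_URLS = [
    "https://www.example-hospital.org/providers/dr-smith",          # slug URL (preferred)
    "https://www.example-hospital.org/providers/profile?id=4821",   # ID-based URL
    "https://www.example-hospital.org/north-clinic/providers/dr-smith",
]


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link> element to place in the <head> of every duplicate page."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


# Every duplicate page carries the same tag, pointing at the slug URL.
tag = canonical_link_tag(DUPLICATE_URLS[0])
print(tag)
```

Each of the three pages would include the same tag, so the crawler indexes one page for Dr. Smith no matter which URL it arrived on.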

Identify secure pages
There are some pages on your site that should always be secured with an SSL certificate to protect visitors’ data. There are even indications that search engines, like Google, may start favoring pages or even entire sites that are served over SSL. But the only thing web crawlers care about is knowing whether the page is http (unsecure) or https (secure) before they wander onto it. VitalSite, Geonetric’s content management system, makes this clear with a ‘require SSL’ option. This is automatically turned on if the entire site uses SSL or if a page includes an element, like a form, that requires protection. You can also turn it on manually. The https will then appear correctly in the sitemap and the canonical URL, so the web crawler knows what to expect.
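
The rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not VitalSite’s code; the `Page` class, its field names and the host name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    path: str
    has_form: bool = False     # page contains an element that needs protection
    require_ssl: bool = False  # editor turned the option on manually


def page_url(page: Page, host: str, site_wide_ssl: bool = False) -> str:
    """Return the URL to publish in the sitemap and the canonical tag."""
    secure = site_wide_ssl or page.has_form or page.require_ssl
    scheme = "https" if secure else "http"
    return f"{scheme}://{host}{page.path}"


# Hypothetical pages on a site that is not SSL everywhere.
about_url = page_url(Page("/about"), "www.example-hospital.org")
contact_url = page_url(Page("/contact", has_form=True), "www.example-hospital.org")
print(about_url)
print(contact_url)
```

The point is that the scheme is decided once and the same URL is used everywhere the crawler might see it.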

Add a recently changed pages RSS feed
A sitemap helps the web crawler find its way through your site. VitalSite automatically creates a sitemap. In addition, VitalSite allows you to add an RSS feed. This feed provides information on new pages and pages that have recently changed, so web crawlers know where to spend their time when updating the index. (If you want to learn more about sitemaps, http://www.sitemaps.org/ is a great resource.)
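
If you are curious what a recently changed pages feed looks like, here is a minimal Python sketch that builds an RSS 2.0 feed with the standard library. The site name, URLs, dates and the `build_changed_pages_feed` function are invented for the example; VitalSite’s own feed may be structured differently.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_changed_pages_feed(site_title, site_url, pages):
    """Build an RSS 2.0 feed from (title, url, last_changed) tuples, newest first."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{site_title}: recently changed pages"
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "New and recently updated pages"

    for title, url, changed in sorted(pages, key=lambda p: p[2], reverse=True):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
        # RSS 2.0 expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(changed)

    return ET.tostring(rss, encoding="unicode")


# Hypothetical pages and change dates.
pages = [
    ("Cardiology", "https://www.example-hospital.org/cardiology",
     datetime(2014, 5, 1, tzinfo=timezone.utc)),
    ("Contact Us", "https://www.example-hospital.org/contact",
     datetime(2014, 6, 1, tzinfo=timezone.utc)),
]
feed = build_changed_pages_feed("Example Hospital", "https://www.example-hospital.org/", pages)
print(feed)
```

Sorting newest first is a choice made for this sketch; it puts the pages a crawler most needs to revisit at the top of the feed.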

That’s what we do. But if you want to kick it up a notch and help web crawlers, there are a few things you can do now:

  • Accurately mark pages to require SSL (if you’re a client of ours, do this in VitalSite).
  • Hide pages from the sitemap that do not need to be indexed. Note that a single sitemap file can list at most 50,000 URLs, so larger sites need several sitemap files tied together by a sitemap index.
  • Create your recently changed pages RSS feed per domain on your site.
  • Submit your sitemap and recently changed pages RSS feed to the search engines using webmaster tools.
  • Talk to Geonetric about other things you can do to improve your site’s SEO with an audit or some SEO tender loving care.
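
For the sitemap size limit mentioned in the list above, here is a minimal Python sketch of splitting a long URL list into sitemap files and building the sitemap index that points to them. The XML element names and namespace come from the sitemaps.org protocol; the file names, URLs and function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split the URL list into groups small enough for one sitemap file each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap(urls):
    """Build one sitemap file listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Build the index file that lists where each sitemap file lives."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = url
    return ET.tostring(index, encoding="unicode")


# Hypothetical site with 120,001 indexable pages.
all_urls = [f"https://www.example-hospital.org/page-{n}" for n in range(120_001)]
chunks = chunk_urls(all_urls)
sitemap_files = [build_sitemap(chunk) for chunk in chunks]
index_xml = build_sitemap_index(
    [f"https://www.example-hospital.org/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
print(len(chunks), "sitemap files")
```

When saving these files, write them as UTF-8 with an XML declaration; you would then submit only the index file to the search engines.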

Keeping the web crawler happy is important. Be found.

This entry was posted in Admin Feed, Agile, Form Builder, Geo.com Homepage Panel by Jennie Ocken.

About Jennie Ocken

Jennie gets product development. With more than 10 years of experience in marketing, engineering and client services at a company that worked exclusively with Fortune 1000 OEM companies, she knows how to identify and prioritize customer needs. And she speaks just enough geek to be accepted by software developers. Skilled at both cross-team communication and agile project management, Jennie leads Geonetric’s engineering team, helping them prioritize, test, document and deliver new features and functionality. She holds a master’s degree in business administration from the University of Iowa and a bachelor’s degree with a double major in creative writing and theatre from Knox College. When she’s not team building or wireframing, this world traveler can be found planning her next adventure, perfecting her Joong Bong Sul moves, or working on her latest science fiction novel.
