Search engines are limited in how they crawl the web and interpret content. A webpage doesn't always look the same to you and me as it looks to a search engine. In this section, we'll focus on specific technical aspects of building (or modifying) web pages so they are structured for both search engines and human visitors. This is an excellent part of the guide to share with your programmers, information architects, and designers, so that everyone involved in a site's construction can plan and develop a search-engine-friendly site.
To be listed in the search engines, your most important content should be in HTML text format. Images, Flash files, Java applets, and other non-text content are often ignored or devalued by search engine spiders, despite advances in crawling technology. The easiest way to ensure that the words and phrases you display to your visitors are visible to search engines is to place them in the HTML text on the page. However, more advanced methods are available for those who demand greater formatting or visual display styles:
- Images in gif, jpg, or png format can be assigned “alt attributes” in HTML, providing search engines a text description of the visual content.
- Search boxes can be supplemented with navigation and crawlable links.
- Content contained in Flash or Java plug-ins can be supplemented with text on the page.
- Video & audio content should have an accompanying transcript if the words and phrases used are meant to be indexed by the engines.
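As a rough illustration of the first point in the list above, the sketch below scans a page's HTML for images and reports which ones carry an alt attribute. It uses only Python's standard library. The sample markup, file names, and the `AltTextAuditor` and `audit_alt_text` names are hypothetical and chosen for illustration; this is not how any search engine actually processes images.

```python
from html.parser import HTMLParser


class AltTextAuditor(HTMLParser):
    """Records, for each <img> tag, whether it has a non-empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.with_alt = []     # (src, alt) pairs for described images
        self.missing_alt = []  # src values for images with no usable alt text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        # A valueless or absent alt attribute comes back as None.
        alt = (attributes.get("alt") or "").strip()
        if alt:
            self.with_alt.append((src, alt))
        else:
            self.missing_alt.append(src)


def audit_alt_text(html):
    """Parse the HTML string and return the populated auditor."""
    auditor = AltTextAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor


# Hypothetical markup, for illustration only.
SAMPLE_HTML = """
<img src="panda-juggling.jpg" alt="A panda juggling three bamboo stalks">
<img src="banner.png">
<img src="spacer.gif" alt="">
"""

result = audit_alt_text(SAMPLE_HTML)
print("Described images:", result.with_alt)
print("Images lacking alt text:", result.missing_alt)
```

An empty alt attribute can be a deliberate choice for purely decorative images, so treat the second list as items to review rather than as errors.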
Seeing Like a Search Engine
Many websites have significant problems with indexable content, so double-checking is worthwhile. By using tools like Google's cache, SEO-browser.com, or the MozBar, you can see which elements of your content are visible and indexable to the engines. Take a look at Google's text cache of the page you are reading now. See how different it looks?
Whoa! That's what we look like?
Using the Google cache feature, we can see that, to a search engine, JugglingPandas.com's homepage doesn't contain all the rich information that we see. This makes it difficult for search engines to interpret relevancy.
"I'm totally going to check out my Axe Battling Monkeys Blog!"
That’s a lot of monkeys, and just headline text?
Hey, where did the fun go?
Uh oh... via Google cache, we can see that the page is a barren wasteland. There's not even text telling us that the page contains the Axe Battling Monkeys. The site is built entirely in Flash, which sadly means that search engines cannot index any of the text content, or even the links to the individual games. Without any HTML text, this page would have a very hard time ranking in search results.
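To get a feel for why a Flash-only page looks empty, here is a minimal sketch that approximates a text-only view using Python's standard library. It collects the text and link targets that appear directly in the HTML and skips script and style contents. It is a simplification for illustration, not a reproduction of Google's cache or of any real crawler. The `TextOnlyView` and `text_only_view` names, both sample pages, and the `/games/banana-toss` URL are hypothetical.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Gathers the text and link targets found directly in the HTML."""

    SKIPPED = {"script", "style"}  # their contents are code, not readable text

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text outside script/style, with whitespace collapsed.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(" ".join(data.split()))


def text_only_view(html):
    """Return (text, links) as found in the raw HTML string."""
    parser = TextOnlyView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


# Hypothetical page whose content lives entirely inside a Flash file.
FLASH_ONLY_PAGE = """
<html><body>
<object data="games.swf" type="application/x-shockwave-flash"></object>
<script>var loaded = true;</script>
</body></html>
"""

# Hypothetical page with the same information expressed as HTML text and links.
HTML_TEXT_PAGE = """
<html><body>
<h1>Axe Battling Monkeys</h1>
<p>Play our <a href="/games/banana-toss">banana toss</a> game.</p>
</body></html>
"""

flash_text, flash_links = text_only_view(FLASH_ONLY_PAGE)
html_text, html_links = text_only_view(HTML_TEXT_PAGE)
print(repr(flash_text), flash_links)
print(repr(html_text), html_links)
```

The Flash-only sample yields no text and no links, while the HTML version exposes both its headline and the link to the game.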
It's wise to not only check for text content but to also use SEO tools to double-check that the pages you're building are visible to the engines. This applies to your images, and as we see below, your links as well.