Official Google Webmaster Central Blog

Wednesday, 25 April 2012

1000 Words About Images

Posted: 25 Apr 2012 01:08 AM PDT

Webmaster level: All

Creativity is an important aspect of our lives and can enrich nearly everything we do. Say I'd like to make my teammate a cup of cool-looking coffee, but my creative batteries are empty; this would be (and is!) one of the many times when I look for inspiration on Google Images.

The images you see in our search results come from publishers of all sizes — bloggers, media outlets, stock photo sites — who have embedded these images in their HTML pages. Google can index image types formatted as BMP, GIF, JPEG, PNG and WebP, as well as SVG.

But how does Google know that the images are about coffee and not about tea? When our algorithms index images, they look at the textual content on the page the image was found on to learn more about the image. We also look at the page's title and its body. We might also learn more from the image's filename, the anchor text that points to it, and its "alt text". We may use computer vision to learn more about the image, and we may also use the caption provided in the Image Sitemap if that text also exists on the page.

To help us index your images, make sure that:
  • we can crawl both the HTML page the image is embedded in, and the image itself;
  • the image is in one of our supported formats: BMP, GIF, JPEG, PNG, WebP or SVG.
Additionally, we recommend:
  • that the image filename is related to the image's content;
  • that the alt attribute of the image describes the image in a human-friendly way;
  • and finally, it also helps if the HTML page's textual contents as well as the text near the image are related to the image.
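As a rough illustration of the alt-text recommendation above, here is a minimal Python sketch that lists the images on a page whose alt attribute is missing or empty. The class name, function name, and sample markup are all made up for this example; it is a simple self-check, not a description of how Google evaluates pages.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collect the src of every <img> whose alt attribute is missing or empty."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An attribute without a value (e.g. <img alt>) is reported as None.
        alt = (attributes.get("alt") or "").strip()
        if not alt:
            self.missing_alt.append(attributes.get("src", ""))


def images_missing_alt(html):
    """Return the src values of images that have no usable alt text."""
    audit = ImageAudit()
    audit.feed(html)
    return audit.missing_alt


# Hypothetical page fragment used only for illustration.
page = """
<h1>Latte art for beginners</h1>
<img src="/images/latte-art-heart.jpg" alt="Heart-shaped latte art in a white cup">
<img src="/images/IMG_0042.jpg">
"""

print(images_missing_alt(page))
```

In this sample, the second image would also benefit from a more descriptive filename than IMG_0042.jpg.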
Now some answers to questions we've seen many times:

Q: Why do I sometimes see Googlebot crawling my images, rather than Googlebot-Image?
A: Generally this happens when it's not clear that a URL will lead to an image, so we crawl the URL with Googlebot first. If we find the URL leads to an image, we'll usually revisit with Googlebot-Image. Because of this, it's generally a good idea to allow crawling of your images and pages by both Googlebot and Googlebot-Image.
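If you'd like to check that your robots.txt lets both crawlers reach an image, a small sketch along these lines may help. It uses Python's standard urllib.robotparser, which implements a simple reading of robots.txt and may not match Googlebot's behavior in every edge case; the rules and URLs below are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /private-images/

User-agent: *
Disallow: /admin/
"""


def crawl_access(robots_txt, url, agents=("Googlebot", "Googlebot-Image")):
    """Report, per user agent, whether the given rules allow fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


for url in (
    "https://www.example.com/images/latte.jpg",
    "https://www.example.com/private-images/draft.jpg",
):
    print(url, crawl_access(ROBOTS_TXT, url))
```

A URL that is allowed for one crawler but blocked for the other is worth a second look, given the answer above.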

Q: Is it true that there's a maximum file size for the images?
A: We're happy to index images of any size; there's no file size restriction.

Q: What happens to the EXIF, XMP and other metadata my images contain?
A: We may use any information we find to help our users find what they're looking for more easily. Additionally, information like EXIF data may be displayed in the right-hand sidebar of the interstitial page that appears when you click on an image.

Q: Should I really submit an Image Sitemap? What are the benefits?
A: Yes! Image Sitemaps help us learn about your new images and may also help us learn what the images are about.

Q: I'm using a CDN to host my images; how can I still use an Image Sitemap?
A: Cross-domain restrictions apply only to the Sitemaps' <loc> tag. In Image Sitemaps, the <image:loc> tag is allowed to point to a URL on another domain, so using a CDN for your images is fine. We also encourage you to verify the CDN's domain name in Webmaster Tools so that we can inform you of any crawl errors that we might find.
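To make the CDN case concrete, here is a small sketch that builds an Image Sitemap in which the page URL (in <loc>) stays on your own domain while the image URL (in <image:loc>) points to a CDN host. The function name and the example.com / example.net URLs are placeholders chosen for this illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """Build an Image Sitemap from a {page_url: [image_url, ...]} mapping."""
    # Use the usual prefixes in the output instead of ns0/ns1.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            # The image location may live on a different host, such as a CDN.
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs: the page is on the main site, the image on a CDN.
sitemap_xml = build_image_sitemap(
    {"https://www.example.com/coffee.html": ["https://cdn.example.net/latte.jpg"]}
)
print(sitemap_xml)
```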

Q: Is it a problem if my images can be found on multiple domains or subdomains I own — for example, CDNs or related sites?
A: Generally, the best practice is to have only one copy of any type of content. If you're duplicating your images across multiple hostnames, our algorithms may pick one copy as the canonical copy of the image, which may not be your preferred version. This can also lead to slower crawling and indexing of your images.

Q: We sometimes see the original source of an image ranked lower than other sources; why is this?
A: Keep in mind that we use the textual content of a page when determining the context of an image. For example, if the original source is a page from an image gallery that has very little text, it can happen that a page with more textual context is chosen to be shown in search. If you feel you've identified very bad search results for a particular query, feel free to use the feedback link below the search results or to share your example in our Webmaster Help Forum.


Our algorithms use a great variety of signals to decide whether an image — or a whole page, if we're talking about Web Search — should be filtered from the results when the user's SafeSearch filter is turned on. In the case of images some of these signals are generated using computer vision, but the SafeSearch algorithms also look at simpler things such as where the image was used previously and the context in which the image was used. 
One of the strongest signals, however, is self-marked adult pages. We recommend that webmasters who publish adult content mark up their pages with one of the following meta tags:
  <meta name="rating" content="adult" />
  <meta name="rating" content="RTA-5042-1996-1400-1577-RTA" />
Many users prefer not to have adult content included in their search results (especially if kids use the same computer). When a webmaster provides one of these meta tags, it helps to provide a better user experience because users don't see results which they don't want to or expect to see. 
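If you publish adult content and want to confirm that your templates actually emit one of these tags, a check along the following lines could be a starting point. The helper names are invented for this sketch, and it only inspects the markup you give it; it says nothing about how SafeSearch itself processes the tag.

```python
from html.parser import HTMLParser

# The two content values mentioned above, lowercased for comparison.
ADULT_RATINGS = {"adult", "rta-5042-1996-1400-1577-rta"}


class RatingFinder(HTMLParser):
    """Collect the content of every <meta name="rating"> tag."""

    def __init__(self):
        super().__init__()
        self.ratings = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "rating":
            self.ratings.append((attributes.get("content") or "").strip())


def is_self_marked_adult(html):
    """Return True if the page carries one of the adult rating meta tags."""
    finder = RatingFinder()
    finder.feed(html)
    return any(rating.lower() in ADULT_RATINGS for rating in finder.ratings)


print(is_self_marked_adult('<head><meta name="rating" content="adult" /></head>'))
```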

As with all algorithms, it may sometimes happen that SafeSearch filters content inadvertently. If you think your images or pages are mistakenly being filtered by SafeSearch, please let us know using the following form.

If you need more information about how we index images, please check out the section of our Help Center dedicated to images and read our SEO Starter Guide, which contains lots of useful information. If you have more questions, please post them in the Webmaster Help Forum.

How to move your content to a new location

Posted: 24 Apr 2012 11:38 PM PDT

Webmaster level: Intermediate

While maintaining a website, webmasters may decide to move the whole website or parts of it to a new location. For example, you might move content from a subdirectory to a subdomain, or to a completely new domain. Changing the location of your content can involve a bit of effort, but it's worth doing it properly.

To help search engines understand your new site structure better and make your site more user-friendly, make sure to follow these guidelines:
  • It's important to redirect all users and bots that visit your old content location to the new content location using 301 redirects. To highlight the relationship between the two locations, make sure that each old URL points to the new URL that hosts similar content. If you're unable to use 301 redirects, you may want to consider using cross domain canonicals for search engines instead.
  • Check that you have both the new and the old location verified in the same Google Webmaster Tools account.
  • Make sure to check if the new location is crawlable by Googlebot using the Fetch as Googlebot feature. It's important to make sure Google can actually access your content in the new location. Also make sure that the old URLs are not blocked by a robots.txt disallow directive, so that the redirect or rel=canonical can be found.
  • If you're moving your content to an entirely new domain, use the Change of address option under Site configuration in Google Webmaster Tools to let us know about the change.
    [Image: the Change of address option in Google Webmaster Tools. Caption: Tell us about moving your content via Google Webmaster Tools]
  • If you've also changed your site's URL structure, make sure that it's possible to navigate it without running into 404 error pages. Google Webmaster Tools may prove useful in investigating potentially broken links. Just look for Diagnostics > Crawl errors for your new site.
  • Check your Sitemap and verify that it's up to date.
  • Once you've set up your 301 redirects, you can keep an eye on visits to your 404 error pages to check that users are being redirected to new pages, and not accidentally ending up on broken URLs. When a user comes to a 404 error page on your site, try to identify which URL they were trying to access and why this user was not redirected to the new location of your content, and then make changes to your 301 redirect rules as appropriate.
  • Have a look at the Links to your site in Google Webmaster Tools and inform the important sites that link to your content about your new location.
  • If your site's content is specific to a particular region you may want to double check the geotargeting preferences for your new site structure in Google Webmaster Tools.
  • As a general rule of thumb, try to avoid running two crawlable sites with completely or largely identical content without a 301 redirection or specifying a rel="canonical".
  • Lastly, we recommend not implementing other major changes when you're moving your content to a new location, like large-scale content, URL structure, or navigational updates. Changing too much at once may confuse users and search engines.
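The redirect and 404 guidelines above can be sketched in a few lines of Python. The example assumes a hypothetical move from www.example.com/blog/ to blog.example.com; the rule table, function names, and log entries are all invented for illustration, and on a real site the 301 itself would be issued by your web server or application rather than by this script.

```python
from urllib.parse import urlsplit

# Hypothetical move: /blog/... on the old site -> https://blog.example.com/...
REDIRECT_RULES = [
    ("/blog/", "https://blog.example.com/"),
]


def redirect_target(old_url):
    """Return the new URL an old URL should 301 to, or None if no rule matches."""
    parts = urlsplit(old_url)
    for old_prefix, new_base in REDIRECT_RULES:
        if parts.path.startswith(old_prefix):
            # Keep the rest of the path so each old URL maps to similar content.
            target = new_base + parts.path[len(old_prefix):]
            if parts.query:
                target += "?" + parts.query
            return target
    return None


def unmapped_404s(log_entries):
    """From (status, url) pairs, return the 404 URLs that no redirect rule covers."""
    return sorted(
        {url for status, url in log_entries
         if status == 404 and redirect_target(url) is None}
    )


# Made-up log entries: (HTTP status, requested URL).
sample_log = [
    (200, "https://www.example.com/"),
    (404, "https://www.example.com/blgo/post.html"),
    (404, "https://www.example.com/blog/old-post.html"),
]

print(redirect_target("https://www.example.com/blog/2012/images.html?page=2"))
print(unmapped_404s(sample_log))
```

URLs reported by unmapped_404s are candidates for a new redirect rule. A 404 on a URL that a rule does cover suggests the rule has not been deployed correctly.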
We hope you find these suggestions useful. If you happen to have further questions on how to move your content to a new location we'd like to encourage you to drop by our Google Webmaster Help Forum and seek advice from expert webmasters.

Written by Fili Wiese (Ad Traffic Quality) & Kaspar Szymanski (Search Quality)

Webmaster Tools spring cleaning

Posted: 24 Apr 2012 04:06 PM PDT

Webmaster level: All

Webmaster Tools added lots of new functionality over the past year, such as improvements to Sitemaps and Crawl errors, as well as the new User Administration feature. In recent weeks, we also updated the look & feel of our user interface to match Google's new style. In order to keep bringing you improvements, we occasionally review each of our features to see if they're still useful in comparison to the maintenance and support they require. As a result of our latest round of spring cleaning, we'll be removing the Subscriber stats feature, the Create robots.txt tool, and the Site performance feature in the next two weeks.

Subscriber stats reports the number of subscribers to a site's RSS or Atom feeds. This functionality is currently provided in Feedburner, another Google product, which offers its own subscriber stats as well as other cool features specifically geared for feeds of all types. If you are looking for a replacement for Subscriber stats in Webmaster Tools, check out Feedburner.

The Create robots.txt tool provides a way to generate robots.txt files for the purpose of blocking specific parts of a site from being crawled by Googlebot. This feature has very low usage, so we've decided to remove it from Webmaster Tools. While many websites don't even need a robots.txt file, if you feel that you do need one, it's easy to make one yourself in a text editor or use one of the many other tools available on the web for generating robots.txt files.
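For those who do need a robots.txt file, here is a minimal sketch of generating one and sanity-checking it. The helper name and the blocked paths are invented for this example, and the check relies on Python's standard urllib.robotparser, whose interpretation of the rules is simpler than Googlebot's and may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text that blocks the given path prefixes."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    return "\n".join(lines) + "\n"


# Hypothetical sections of a site that should not be crawled.
robots_txt = build_robots_txt(["/tmp/", "/internal-search/"])
print(robots_txt)

# Sanity check: parse the generated rules and test a few URLs against them.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
blocked = parser.can_fetch("Googlebot", "https://www.example.com/tmp/draft.html")
allowed = parser.can_fetch("Googlebot", "https://www.example.com/coffee.html")
print(blocked, allowed)
```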

Site performance is a Webmaster Tools Labs feature that provides information about the average load time of your site's pages. This feature is also being removed due to low usage. Now you might have heard our announcement from a couple of years ago that the latency of a site's pages is a factor in our search ranking algorithms. This is still true, and you can analyze your site's performance using the Site Speed feature in Google Analytics or using Google's PageSpeed online. There are also many other site performance analysis tools available like WebPageTest and the YSlow browser plugin.

If you have questions or comments about these changes please post them in our Help Forum.

Another step to reward high-quality sites

Posted: 24 Apr 2012 02:46 PM PDT

Webmaster level: All

Google has said before that search engine optimization, or SEO, can be positive and constructive—and we're not the only ones. Effective search engine optimization can make a site more crawlable and make individual pages more accessible and easier to find. Search engine optimization includes things as simple as keyword research to ensure that the right words are on the page, not just industry jargon that normal people will never type.

"White hat" search engine optimizers often improve the usability of a site, help create great content, or make sites faster, which is good for both users and search engines. Good search engine optimization can also mean good marketing: thinking about creative ways to make a site more compelling, which can help with search engines as well as social media. The net result of making a great site is often greater awareness of that site on the web, which can translate into more people linking to or visiting a site.

The opposite of "white hat" SEO is something called "black hat webspam" (we say "webspam" to distinguish it from email spam). In the pursuit of higher rankings or traffic, a few sites use techniques that don't benefit users, where the intent is to look for shortcuts or loopholes that would rank pages higher than they deserve to be ranked. We see all sorts of webspam techniques every day, from keyword stuffing to link schemes that attempt to propel sites higher in rankings.

The goal of many of our ranking changes is to help searchers find sites that provide a great user experience and fulfill their information needs. We also want the "good guys" making great sites for users, not just algorithms, to see their effort rewarded. To that end we've launched Panda changes that successfully returned higher-quality sites in search results. And earlier this year we launched a page layout algorithm that reduces rankings for sites that don't make much content available "above the fold."

In the next few days, we're launching an important algorithm change targeted at webspam. The change will decrease rankings for sites that we believe are violating Google's existing quality guidelines. We've always targeted webspam in our rankings, and this algorithm represents another improvement in our efforts to reduce webspam and promote high quality content. While we can't divulge specific signals because we don't want to give people a way to game our search results and worsen the experience for users, our advice for webmasters is to focus on creating high quality sites that create a good user experience and employ white hat SEO methods instead of engaging in aggressive webspam tactics.

Here's an example of a webspam tactic like keyword stuffing taken from a site that will be affected by this change:

Of course, most sites affected by this change aren't so blatant. Here's an example of a site with unusual linking patterns that is also affected by this change. Notice that if you try to read the text aloud you'll discover that the outgoing links are completely unrelated to the actual content, and in fact the page text has been "spun" beyond recognition:

Sites affected by this change might not be easily recognizable as spamming without deep analysis or expertise, but the common thread is that these sites are doing much more than white hat SEO; we believe they are engaging in webspam tactics to manipulate search engine rankings.

The change will go live for all languages at the same time. For context, the initial Panda change affected about 12% of queries to a significant degree; this algorithm affects about 3.1% of queries in English to a degree that a regular user might notice. The change affects roughly 3% of queries in languages such as German, Chinese, and Arabic, but the impact is higher in more heavily-spammed languages. For example, 5% of Polish queries change to a degree that a regular user might notice.

We want people doing white hat search engine optimization (or even no search engine optimization at all) to be free to focus on creating amazing, compelling web sites. As always, we'll keep our ears open for feedback on ways to iterate and improve our ranking algorithms toward that goal.
