WebProNews sat down with Derrick Wheeler, Senior SEO Architect at Microsoft, to talk about large-scale SEO and how Microsoft manages SEO across its huge multi-business, global website.
Chances are you don’t have a site that matches the size of Microsoft’s, but in the age of real-time, user-generated content, there is a whole lot of content going up on the web. Wheeler’s (and ultimately Microsoft’s) strategy deals with "mega" sites.
"It’s a large complicated website where the content is generated by multiple business units in many different countries in many different languages, and you’re trying to get things done within a complex, large organization, where there’s just a lot of dependencies – a lot of stakeholders – a lot of different interests," explains Wheeler.
"A lot of people talk about ‘content is king. content is king,’ says Wheeler. "With ‘mega SEO,’ structure is king because without structure, your content won’t even be discovered."
I’d still recommend producing great content, but when you’re talking about a site the size of Microsoft.com, Wheeler has a point.
"Some of the situations with our site, Microsoft.com…we’ll have one million pages of navigation to get to fourteen thousand pages of content, and the way that you get to that content determines the URL of the final landing page, so every final landing page of content will have however many different ways there are of getting to it duplicated, so you know, you’ve got like twenty million URLs just for fourteen thousand pages," he says. "So a lot of mega SEO is about crawl efficiency – making your site more efficient for crawling and indexing."
"One of the things that we deal with are the crawler efficiencies – things like large scale duplicate content or just junk content – outdated content – content that’s been up for like five years, but the person that managed it left the company and no one took over so there’s just content sitting out there that engines have to index," he continues. "We don’t want that stuff surfacing. We want our new stuff, so getting rid of legacy content, trying to fix things at the platform level, so you don’t continue to make the same mistakes over and over and over or just build on the issues that you have with your existing content management system."
"But one of our challenges is we have multiple content management systems," he adds. "We’ve got one primary for one section, another section of the site might have two or three that they use. I mean it’s basically all over the board."
"I can’t just go in and fix the CMS and have everything magically fixed. We have to go in and prioritize what CMS we want to try to work with," he adds.
How do you deal with that? Turn to the IT guys, of course.
"MSIT was involved with that – our IT department," says Wheeler. "They could tell which content hadn’t been updated in a certain amount of time, and then they reached out to whoever was listed as the owner of that section and asked them if they still needed that content. If there was no response in a certain amount of time, they would just remove it. And if they did respond, then they would work out whether or not the content was still valid, and if it wasn’t, then they all agreed it would be removed."
"It was a lot of email chains that I was on," he adds. "Hundreds of emails back and fourth to get all this accomplished, and I think they removed probably a million, two million URLs from the site just by that one exercise."
Penalties? For Microsoft?
"A lot of these pages of content weren’t getting any traffic," Wheeler notes. "That was another way that we could tell that they were not really useful….We didn’t go in and manually map them to any other section of the site."
You might think search engines would penalize you for having 2 million URLs that go nowhere, but when you’re Microsoft, that’s not something you really need to worry about (and it’s not like Google would treat the competition unfairly).
"I don’t think an engine is going to dock us for having pages of content that were really old and not updated and removing them from our website, and the proper response for a page that no longer exists is the 404," says Wheeler. "I don’t think that they would penalize us for that. I’m pretty sure of it."
"We could’ve gone in probably and found some that were valuable and redirected them somewhere, but in general, our site has a lot of authority just because when we launch something, we get a ton of links," he says. "You know, people – bloggers are always talking about Microsoft and all the stuff that we’re doing. Our site in general has a lot of authority, so it wasn’t a big priority for us at the time."
For "mega" sites, this is probably the case a lot of times.
Small Strides For a Big Impact
When you’re talking about a site the size of Microsoft.com, there are other things besides irrelevant content that are likely to come into play. "That’s just one aspect of mega SEO," says Wheeler. "The other would be the international piece – it’s huge for us, because we have close to a hundred different countries and many different languages, and there’s 23 countries that we really focus a lot on, but our content – the way we publish it basically…for Australia, their content can be in a lot of different places scattered all over our website, and it’s hard for them to manage their SEO when their content’s spread all over the place."
"So one of the things we’ve tried to do is come up with a standard international URL policy, because without that, it’s hard for a country to even manage their own content," he says. "Event that’s been a battle because some of the content management systems that we publish on can’t conform to that structure so it’s just a constant….with mega SEO it’s about making small strides over time that [when] grouped together they have a really big impact."
Who’s in Charge of the Whole Site? It’s Just Ballmer.
"There’s so many different business groups and our website Microsoft.com doesn’t roll up to a single person until it gets to Steve Ballmer," says Wheeler. "As soon as you break off of Steve Ballmer, you’ve got someone else that’s responsible for MSDN TechNet. There’s another business group that’s responsible for the support site…so we don’t have a centralized authority that manages the entire Microsoft.com domain. So it’s very difficult because some businesses will make decisions on what’s in their best interest, and it might not really be what’s in the best interest of our site as a single domain name."
"The first thing I did was really try to draw an image (because I’m very visual) of what are all the pieces involved in order to optimize the site," says Wheeler of his approach. "And for us…there’s four levels of where the SEO occurs on the site, and to support those four levels, there’s a lot of what we’ll call workstreams or initiatives or focus areas that support those four levels."
SEO by Level
"The first level is the site-wide SEO," explains Wheeler. "That’s the crawl efficiency stuff we talked about. The next level is subsidiary level SEO, which is the international piece and working with them."
"The next is what we call site-specific so there might be an individual site on Microsoft.com – they want to do SEO…well we have three levels and they can do it themselves and we provide guidance, they can do a little bit with an agency (just have the agency do the keyword research, do some training…), or they can do a full service agency program," he continues. "And then there’s the people who say, ‘I want to optimize this page for this keyword’. Well, we’ll give them some generic advice like, ‘you should use that word on your page and you should actually think [about] more than just that page and on board to one of our site-specific programs.’"
How Do You Measure All of This?
"And then in support of that we have a standard measurement framework, because when I got there, there was a lot of different ways that people were measuring SEO," he says. "In fact, our site in general…half the site uses one web analytics application, the other half uses another, and some of them are tagged with both."
"Just getting all the metrics is a challenge," Wheeler adds. "And then we also have search technology that needs to scale for all four levels. We’ve got our own set of web crawlers set up to crawl our site to look for those big issues, and we can also crawl individual sites and tell them where their SEO problems are."
Gathering the Masses
"We work with agencies, vendors…anything we can to help us scale this out," says Wheeler. "We had a two-day Microsoft-only SEO Summit called XMS (which is SMX backwards), but stands for cross-Microsoft. We had over 560 attendees that were all Microsoft people. We had all Microsoft speakers, which I thought was incredible that one company could have 560 attendees to an internal SEO event, and the entire event cost seventeen thousand dollars for 560 people. Now if we sent them all to a conference like [PubCon], that would be like 560 thousand dollars plus travel."
"And it was Microsoft-specific…all of the content was targeted towards Microsoft websites and out of that, we found a lot of internal ‘SEO rock stars’ that I can start building relationships with, and they’re great evangelists across the company."
As you can see, "mega SEO" is no minor feat. How would you like to have Wheeler’s job? We also talked with Bill Hunt of Back Azimuth Consulting at PubCon about the challenges of big company SEO. These companies may get a lot of links and rankings, but it’s not exactly easy.