Update: Stay tuned to live.dev.webpronews.com for an exclusive interview with Peter Norvig, Google’s Director of Research.
Google’s director of research Peter Norvig is talking at SMX West at 9:15 AM PST / 12:15 PM EST, and I will be liveblogging it below. Watch the video live at our new site live.dev.webpronews.com (it will also be available there archived later if you don’t have time to watch it as it happens).
Liveblogging begins…
12:09 : Should start in a few minutes…please forgive any typos as they come 🙂
12:15 Danny Sullivan says keynote will be in just a moment.
12:16: Executive Editor of Search Engine Land Chris Sherman, Danny Sullivan, and Norvig on stage.
12:17: Norvig built shopping tool that became Yahoo Shopping…author…philosopher…world record holder (palindrome)…various introductory tidbits about Norvig….rocket scientist…performer (nutcracker performance)
12:21: Peter to show interesting things Google Research is working on…
12:22: 15 minutes for this portion, he says…after the Chile earthquake, they put out a person finder, already did one for Haiti. took an hour to customize and get it out…ties in with earthquake data, aftershocks..
PowerMeter, built by someone else, Google built software that you can put in your house and monitor power you use each day. people can cut consumption by 20%…
Earth Engine – app that works over earth data and shows deforestation/rain forest…overlay for Google Earth…comes up with differentiation to see where deforestation occurs…
Street View – where no car has gone before – trike, snowmobile, up in Whistler and other spots.
User photos in street view…
12:24 Image Swirl – image recognition software…made it into regular search product. click "similar images" on any image…based on color/shape
web scale image annotations – taken set of images and queries that trigger images and match them up…dolphin/translations of dolphin…all words match same types of pictures….
annoyed by captchas…they’re getting harder and people make more mistakes….using image rotation…rather than type in words, we’ll show picture. you rotate it back to straight up. easy to do but difficult for computers…
Google Goggles – app that runs on phone with camera. take a picture of landmark, business card (create contact), product, tells you what it is..
Video scene carving – you have a pic and you want to squish it down to smaller aspect ratio – done work where, in one frame, you can make things look better…
access to lots of data they’d like to share with academic community – jobs running in our clusters – here’s list of jobs. how much memory they took, etc.
App Inventor for Android – introductory program development environment to teach people how to program on phones….phones more personal than PCs for younger users….more excited about doing it for the phone…
12:28 Speech recognition…
12:28 Translating phone. don’t have it yet, but have the pieces…translations up to 62 languages…voice is slower to roll out.. working on languages as they go….
Low-resource MT: Yiddish…not a lot of written text…the text that exists is hard to identify…much is actually Hebrew…can be hard to tell.
Punctuation/capitalization – in transcribed speech –
12:30: Sound understanding – not a product but is in development. you should be able to go to YouTube and say you want something that sounds like a rooster and get it..takes sonogram picture of sound over time…applies image processing algorithms..turns sound into picture…
Google Squared – Building database of attributes…
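A rough sketch of the "turn sound into a picture" idea, not Google's actual method: slice the audio into short overlapping frames and compute the strength of each frequency in each frame. The result is a grid (time by frequency) that can be handled like an image. The function name, frame size, hop and the synthetic 1000 Hz test tone are all assumptions made for illustration.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a time-by-frequency grid of magnitudes (hypothetical helper)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window softens the frame edges to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Plain discrete Fourier transform, keeping non-negative frequencies.
        row = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            row.append(abs(acc))
        frames.append(row)
    return frames


# Synthetic input (assumed for the example): a 1000 Hz tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(512)]
image = spectrogram(tone)

# The brightest "pixel" in the first column marks the dominant frequency bin.
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
peak_hz = peak_bin * rate / 64
```

Each row of `image` is one time slice, so image processing routines can scan it for patterns in the way Norvig describes. A real system would use an FFT library rather than this slow direct transform.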
Clustering – analysis of words within a context…
attribute extraction – recognize cluster for basic foods…
12:31 Browser size…may be applicable for optimizing clients’ sites…puts overlay over page to see what percentage of people can see certain aspects of page.
A lot of this stuff has been talked about before (including here at WPN)…
12:32 In conclusion, quotes Yogi Berra "you can observe a lot by watching…" That’s what Google is trying to do…
12:33: Sherman says most people aware of Google’s 20% time policy…asks other ways they approach research.
Norvig says anybody can go out and start a project on their own…they can go to friends and get help working on it…when they’re not busy working on their own, unless they like your project better…then Google reviews it and sees how far it can go..infrastructure makes it easy for people to do experiments…when Google evaluates it, they’re not looking at powerpoint slides, they’re looking at a demo…
12:35 Demo will have a lot of ragged edges, but it will run at full scale…
12:36 At Google we want to build the tools so when you start out experimenting, you want to run it over the full web or the full dataset…
Some people are on loan to product groups…they jump in and out…some start out as research projects…PhDs running around with pagers all over the place trying to keep the system running and push it forward…
12:38: Sherman asks how he decides between short term/long term projects: Says pushing very hard toward doing something useful. the good theoretical things will be driven by that…
12:39: Danny asks what are the biggest things to come out of the 20% time: Says depends on who you ask…because it’s so informal, different people have different opinions….Gmail/AdSense?…..both came out of playing around, but creator doesn’t think they were 20% time because they became 100% time…but Norvig thinks they came OUT of 20% time…
12:41: Machine translation was done internally, then they recruited a team to develop it further…
12:42: Founder involvement? They want to understand what’s going on. 2 roles: setting long range direction and trying to evaluate as many projects as possible. they’re just as hands on as they’ve always been…to them, their life hasn’t changed very much…for the rest, life has changed, because the interval is a lot longer…
12:42 Danny asks if they have 20% projects…no…does Norvig? Yes he’s looking at education search…different than short term searching….
12:43: Research facilities all over world….you want to have experts in local languages/local cultures….get translations right. type of search products for different countries…culture….aspect of needing more engineers and they’re spread out…would like to have more people working in one place.
12:46 Real-time/social search – challenge to quickly understand new relevance signals, Sherman asks…one thing that is still overhyped is PAGERANK…Google has made this clear in the past…Google never felt that it was such a big factor…always looked at all available data. Combining every available signal and trying to figure out the best way to combine them. infrastructure makes it possible to do real-time….they’ve had an evolution with that…indexing used to be once a month (you remember the dance)….then they went to daily…then they went to hourly, and Larry was really pushing back and saying hourly’s not good enough. the team said if you want it this year, it’s the best we can do…eventually he gave in…call it the 600 second index.
12:49: Is it time to change the name of PageRank? "I think that’s right," Norvig says. Says they need some better branding…
12:50 Caffeine: in one data center still…testing and it’s doing "very very well" will be rolling out soon to other data centers…doesn’t know exact timing….
12:51 Signals/ranking – any you could share besides links? in a lot of cases, you can say Google’s manufacturing links (local search)…we try to say "this is a business, this is referring to something else…" book scanning…books don’t have hyperlinks, but they do have bibliographies….build links that way…community interested in trying to find right keywords/synonyms, etc. trying to help do things like that…
Division between focus on core search vs. advertising vs other products? yes, fundamental distinction…don’t want them to interfere with each other…shouldn’t mix…share some of core tech…using Google File System, table infrastructure, etc. teams are working separately trying to keep things distinct.
More work put into core search vs. ads? don’t know in terms of total numbers…different types of challenges…those are the real core…most effort of the company will be focused on those….other products peripheral…boundary of new stuff…
recognizing a business off of their name…more understanding of concept of "me" or of a "company"? starting to get there…getting closer…Google Squared, you can ask for companies and see CEO, headquarters, revenue, etc. still a little buggy and will keep improving…if you want right answers, you do have to understand these things…want to support these types of queries…show me these companies and rank them by revenue. unless there’s a page that does that, you’re on your own right now. Google wants to do this kind of thing with you. you tell them more about how you want it, and they’ll help you build that table…
speech recognition a hard problem 20 years ago, says Sherman…solved now pretty much…what kind of problems like that are there now? Norvig responds….vision is the big problem now. speech/translation now able to apply…they work well…you gather more data and experience…same standard model, but we’ve gotten better at it… if you improve a percent or two every year, you’re in good shape in 20 years…vision…and video images….a challenge…in terms of computational issues…so much more data involved in a video than in a text file…being able to push that through your system…and it’s just messier..trying to parse a picture up into objects and understand what’s doing what and how they’re moving and what that means….words are nice because you can differentiate them. it’s easier. the objects are harder.
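A quick back-of-the-envelope sketch of the "a percent or two every year" remark, assuming the gains are relative and compound from year to year (Norvig did not spell out how the improvement is measured, so this reading is an assumption, as is the helper name):

```python
def compounded_gain(rate_per_year, years):
    """Total relative improvement after compounding a yearly rate (hypothetical helper)."""
    return (1 + rate_per_year) ** years - 1


# One percent and two percent per year, each sustained for 20 years.
one_percent = compounded_gain(0.01, 20)   # roughly 0.22, i.e. about 22% better overall
two_percent = compounded_gain(0.02, 20)   # roughly 0.49, i.e. about 49% better overall
```

Under that assumption, small yearly gains add up to a quarter to a half better over two decades, which fits the point that the same standard model keeps improving with more data and experience.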
1:01 Email overload solutions, asks Danny? -an intern last summer working on that…they do have some experimental things they’ll be rolling out…prioritization and tools for helping with that…watch for that. still playing with it. another thing – is email really the right tool? for me, one reason email is bad is I’m on all these legacy mailing lists I don’t really need to be on. maybe just slashing all of that down and starting over will help. if the new model is Wave or Buzz or Twitter or something else, I don’t know what the right answer is, but sometimes starting over is the right step. I STILL USE EMAIL MORE he says (than Wave)….
1:03: Teams using Wave a lot? Yes, some are…people still trying to figure out where it works, and I think we’re at a confusing point…so many tools now. you have to start something new…just within the company…
Liveblogging ends…
Check out live.dev.webpronews.com for archived footage of the keynote, and coming up, WebProNews will have an exclusive interview with Peter Norvig.