How Google Works: Excerpts :: Lost Ferry
Source: BlogBus. Original link: http://ferryslife.blogbus.com/logs/2006/07/2845621.html Archive link: https://web.archive.org/web/20061109231754id_/http://ferryslife.blogbus.com/logs/2006/07/2845621.html
Excerpted from: http://www.baselinemag.com/article2/0,1540,1985040,00.asp

"Would any of you be really proud to have this in your data center?" Merrill asks, pointing to the disorderly stack of servers connected by a tangle of cables. "But this is the start of the story," he adds, part of an approach that says "don't necessarily do it the way everyone else did. Just find some way of doing it cheap and effectively, so we can learn."

Previous search engines had not analyzed links in such a systematic way. According to The Google Story, a book by Washington Post writer David Vise and Mark Malseed, Page had noticed that early search engine king AltaVista listed the number of links associated with a page in its search results but didn't seem to be making any other use of them. Page saw untapped potential.

To cope with these demands, Page and Brin developed a virtual file system that treated the hard drives on multiple computers as one big pool of storage. They called it BigFiles. Rather than save a file to a particular computer, they would save it to BigFiles, which in turn would locate an available chunk of disk space on one of the computers in the server cluster and give the file to that computer to store, while keeping track of which files were stored on which computer.

This was the start of what essentially became a distributed computing software infrastructure that runs on top of Linux.

The idea is to "store data reliably even in the presence of unreliable machines." A GFS cluster consists of a master server and hundreds or thousands of "chunkservers," the computers that actually store the data. The master server contains all the metadata, including file names, sizes and locations. When an application requests a given file, the master server provides the addresses of the relevant chunkservers.
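The bookkeeping described above (pick a machine with free space, store the file there, remember where it went, answer lookups with an address) can be sketched in a few lines of Python. This is only a toy illustration, not Google's code: the class name `ToyCluster`, its methods, the server names and the capacities are all hypothetical, and the sketch stores each file whole on a single machine, whereas the real GFS splits files into chunks and replicates them.

```python
class ToyCluster:
    """Toy model of a metadata master plus chunkservers (hypothetical names)."""

    def __init__(self, capacities):
        # chunkserver address -> free space in bytes (made-up capacities)
        self.free = dict(capacities)
        # chunkserver address -> {file name: file contents}
        self.store = {addr: {} for addr in capacities}
        # master metadata: file name -> (size in bytes, chunkserver address)
        self.metadata = {}

    def save(self, name, data):
        """Store `data` on the chunkserver with the most free space."""
        if name in self.metadata:
            raise FileExistsError(name)
        addr = max(self.free, key=self.free.get)
        if self.free[addr] < len(data):
            raise OSError("no chunkserver has enough free space")
        self.store[addr][name] = data
        self.free[addr] -= len(data)
        self.metadata[name] = (len(data), addr)
        return addr

    def locate(self, name):
        """Master lookup: return the address of the server holding `name`."""
        _size, addr = self.metadata[name]
        return addr

    def read(self, name):
        """Client read: ask the master where the file is, then fetch it."""
        return self.store[self.locate(name)][name]


cluster = ToyCluster({"cs-a": 100, "cs-b": 50})
cluster.save("index.dat", b"x" * 60)   # lands on cs-a, the emptiest server
cluster.save("links.dat", b"y" * 45)   # cs-a now has 40 bytes free, so cs-b is chosen
print(cluster.locate("links.dat"))
```

The caller never names a machine when saving or reading; only the metadata table knows where a file lives, which is the property the excerpt attributes to BigFiles and to the GFS master.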
The master also listens for a "heartbeat" from the chunkservers it manages; if the heartbeat stops, the master assigns another server to pick up the slack.

Having studied Google's publications, he notes that the company has had to tinker with computer science fundamentals in a way that few enterprises would: "I mean, who writes their own file system these days?"

For all the papers it has published, Google refuses to answer many questions. "We generally don't talk about our strategy ... because it's strategic," Page told Time magazine when interviewed for a Feb. 20 cover story.

"PageRank is well known because Larry published it; well, they'll never do that again." Today, Google seems to have created a very effective "cult of secrecy," he says. "People I know go to Google, and I never hear from them again."

Consider how Google handles project management. Every week, every Google technologist receives an automatically generated e-mail message asking, essentially, what did you do this week and what do you plan to do next week? This homegrown project management system parses the answer it gets back and extracts information to be used for follow-up. So, next week, Merrill explains, the system will ask, "Last week, you said you would do these six things. Did you get them done?"

"What we're looking for here is lots of accidental cross-pollination," Merrill explains, so that employees in different offices, perhaps in different countries, can find out about other projects that might be relevant to their own work.

Despite Google's reputation for secrecy toward outsiders, internally the watchword is "living out loud," Merrill says. "Everything we do is a 360-degree public discussion."

Posted by ferryzhou on 2006-07-16 11:50:14