Lei Feng Network editor's note: the author, Zhu Yun (朱赟), is a senior female software engineer at Airbnb. (Note: the avatar is a real photo.)
(Illustration: hand-drawn by Emily Cheng, an engineer at Airbnb and my neighbor)
Most Silicon Valley companies employ many Data Scientists, abbreviated DS. I am not sure whether the title used in China is the literal translation, "data scientist".
A few days ago InfoQ published an article on its website about anomaly detection on Airbnb's payment platform. It is a translation of an English post on the Airbnb blog.
The article describes an anomaly detection system at Airbnb:
Airbnb serves 190 countries around the world and supports many currencies. In most cases payments go through, but occasional outages occur: for example, a currency cannot be processed, or a payment channel is unreachable. To catch such failures as early as possible, the company's data team built a real-time anomaly detection system to identify them. The system helps product teams pinpoint problems, and it frees data analysts for other work, such as A/B testing new payment methods or product launches, adjusting or forecasting prices, and building machine learning models for personalized recommendations.
Note: among IT media outlets, InfoQ sets a very good example in how it handles translations of English blog posts. First, in terms of accuracy, the overall quality of the translations on the InfoQ site is high, and someone proofreads them. Second, as far as I know, they contact the authors for authorization before translating an article; this one, for instance, was approved by Airbnb's Eng and PR departments. A small personal salute to InfoQ for respecting intellectual property.
We still use this anomaly detection system, and it will continue to be improved. The English post was written by Lu Jingxiao, a Data Scientist in our group. After it was published, a few friends asked me privately about details such as "How do you build a system like this from scratch?" and "What kind of people does it take?" So today I will talk about what companies require of a Data Scientist and what the work involves.
By the way, Twitter has a similar anomaly detection system, released as an open source R package. The main difference lies in how seasonality is handled: Airbnb models seasonality with an FFT, which may be more flexible. If you are interested, you can download Twitter's open source package and build a similar system for your own needs. Our system is not open source.
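To make the idea of modeling seasonality with a Fourier transform more concrete, here is a minimal sketch. It is not Airbnb's system (which is not open source) and not Twitter's package. It uses a naive discrete Fourier transform from the standard library in place of a real FFT, which would compute the same spectrum faster. The function names, the `keep` and `threshold` parameters, the median-absolute-deviation rule, and the sample data are all assumptions made for illustration.

```python
import cmath
import math
import statistics


def dft(values):
    """Naive O(n^2) discrete Fourier transform (an FFT gives the same result faster)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def seasonal_fit(values, keep=1):
    """Rebuild the series from its mean plus the `keep` strongest frequencies."""
    n = len(values)
    spectrum = dft(values)
    # Rank the positive frequencies by magnitude and keep the strongest ones.
    strongest = sorted(
        range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]), reverse=True
    )[:keep]
    kept = {0}  # index 0 carries the mean of the series
    for k in strongest:
        kept.update({k, n - k})  # keep the conjugate partner so the fit stays real
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in kept) / n).real
        for t in range(n)
    ]


def find_anomalies(values, keep=1, threshold=4.0):
    """Return indices whose residual from the seasonal fit is unusually large."""
    fit = seasonal_fit(values, keep)
    residuals = [v - f for v, f in zip(values, fit)]
    center = statistics.median(residuals)
    # Median absolute deviation: a spread estimate that one outlier cannot inflate.
    mad = statistics.median([abs(r - center) for r in residuals])
    scale = 1.4826 * mad
    if scale == 0:
        return []  # degenerate case: the fit explains the series exactly
    return [i for i, r in enumerate(residuals) if abs(r - center) > threshold * scale]


# Hypothetical data: 8 weeks of daily payment counts with a weekly cycle,
# and an outage on day 30 where nothing goes through.
series = [100 + 20 * math.sin(2 * math.pi * t / 7) for t in range(56)]
series[30] = 0.0
print(find_anomalies(series))
```

The number of frequencies kept and the threshold are tuning knobs. A real system would choose them from historical data and would also have to handle trend and noise, which this sketch ignores.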
| What does a Data Scientist do?
At Internet companies in general, the work of a DS may include (but is not limited to) the following four categories:
Building dashboards. At some companies this is done by Business Intelligence (BI). It mostly means using statistical and charting tools to build a dashboard that shows key metrics and information at a glance, and that presents the complex relationships in the data to other employees in the most intuitive way. PMs, managers, and staff at every level then get a more accurate and more complete picture.
Working with the data engineers who collect and clean data to build data pipelines. This includes writing programs in scripting languages (such as Python) to fetch the data you need and to do some processing on it.
Working with machine-learning-focused data scientists and machine learning software engineers to build machine learning models, analyzing the training results, and helping to tune parameters and models.
Data analysis of all kinds, including statistical analysis of A/B test results. At many companies A/B testing is probably used about as widely as machine learning. It is simple and effective, especially for deciding which product feature or UI users prefer. The method is straightforward: two or more designs of a product are pushed at random, in equal proportion, to different groups of users, and analysis of the user feedback data quickly determines which design is better, or which design works better for which users and scenarios. The results sometimes even contradict the intuition behind the design. But once the sample size and the accumulated data are large enough, the results are quite convincing.
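The statistical analysis of an A/B test mentioned above can be sketched with a two-proportion z-test, one common way to compare conversion rates between two groups. This is only an illustration under assumed numbers: the function name and the visitor and conversion counts are hypothetical, and the article does not say which test any particular company uses.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of designs A and B; return (z, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both designs convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: each design is shown to 2000 randomly chosen users.
z, p = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With small samples the same difference in rates gives a much larger p-value, which is why the results only become convincing once enough data has accumulated.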
| What academic background do Data Scientists have?
Most DS come from mathematics or statistics, but many also come from physics, mechanical engineering, finance, and other fields. Compared with programmers, DS have a higher average level of education; a large proportion hold a master's degree or a PhD.
The Zhihu Daily article "How to become a data scientist?" says:
Data science is the study of extracting knowledge from data; the key word is science. The field integrates many different elements, including signal processing, mathematics, probabilistic models, machine learning, computer programming, statistics, data engineering, pattern recognition and learning, visualization, uncertainty modeling, data warehousing, and high performance computing, with the aim of extracting patterns from data and building data products. Data science is not limited to big data, but the growth in the volume of data has indeed made data science more important.
Practitioners of data science are known as data scientists. Data scientists solve complex data problems through deep expertise in some scientific discipline. In the not too distant future, data scientists will need to be proficient in one, two, or even more disciplines while also working with elements of mathematics, statistics, and computer science. For that reason, data scientists work as a team.
Greylock, the venture capital firm that invested in Facebook and LinkedIn, describes data scientists as people who can "manage and gain insight from data". On the IBM website, the data scientist role is described as "half analyst, half artist". The role represents an evolution of the business analyst or data analyst role.
| What makes a good Data Scientist?
Apart from fields such as finance, which place strict demands on a DS's technical background, many Internet companies such as Square, Airbnb, and Facebook care more about soft skills once you meet a certain technical bar. For example:
Sensitivity to data: the ability to spot information hidden in the data and to verify it through modeling.
The ability to communicate with all kinds of non-DS colleagues. A Data Scientist is usually assigned to a team and needs to work closely with product managers, engineers, and others. Coordinating the flow of information within the team and getting data-driven experiments implemented are marks of an excellent DS.
Visual presentation of data: knowing how to choose the most effective way to convey the information in the data clearly and accurately.
The ability to accurately analyze the dependencies and correlations between core metrics and the data, which helps produce proposals that are more likely to improve those metrics.
This is why many Silicon Valley companies prefer to hire DS with work experience, and why many small and medium-sized companies simply do not hire new graduates: the dashboards and analyses a DS produces are the main basis on which company executives make decisions.
| How much do Data Scientists earn in Silicon Valley?
I do not have enough data on this question, and I do not want to mislead anyone with a biased answer. It does remind me of something else, though.
We often see average salary statistics for each profession on LinkedIn, GlassDoor, and other websites. Some of my friends and I think these figures are heavily biased, and biased low. Why do I say that? I guess there are two reasons. First, people who have just entered the workforce are more willing to take part in such surveys, while more senior people rarely or never do. Second, relatively few people at high-paying companies take part. Why do I guess this? Only because the published statistics are lower than the actual situation as I know it. Don't argue with me; I am only stating my opinion, believe it or not.
| Why are Data Scientists important?
Needless to say, a trustworthy and reliable interpretation of the data is an important basis for making judgments.
For many companies, recruiting an outstanding Data Scientist is in fact as important as hiring a talented software engineer.
You can probably get a rough idea of how data-driven a company is by looking at its ratio of Data Scientists to engineers.
Lei Feng Network note: the illustrations are from Ju.OutOfMemory.CN. To reprint this article, please contact us for authorization, keep the text complete, credit the source and the author, and do not modify the article.