Sunday 19 January 2014

[E150.Ebook] Ebook Practical Data Science with R, by Nina Zumel, John Mount

What if you started your day by reading Practical Data Science with R, by Nina Zumel and John Mount, on your own device? Most people reach for a gadget as soon as they wake up and keep using it through the morning, which is why we suggest keeping this book there too. If you are unsure how to get the book onto your device, you can follow the method below. Practical Data Science with R, by Nina Zumel and John Mount, is available on this website.

Book lovers, when you need a new book to read, you can find Practical Data Science with R, by Nina Zumel and John Mount, here. There is no need to worry about not finding what you are looking for. Is Practical Data Science with R the book you need right now? If so, you are a keen reader. It comes from respected authors, and it offers experience and lessons not only to take away but also to learn from.

Although the price of Practical Data Science with R is low, many people are still reluctant to spend money on books. Others are disinclined, or have no time, to visit a bookshop and search for it. This is the modern era, though, and many e-books are easy to obtain. This book and many others can be accessed very quickly, and you do not have to go out to get it.

By visiting this page, you have found the right starting point. This is where you begin choosing the book you want. There are many recommended e-books to read. If you would like Practical Data Science with R, by Nina Zumel and John Mount, as an e-book, click the link on this page to download it. Within a short time, the e-book will be yours.

Because Practical Data Science with R, by Nina Zumel and John Mount, is offered online, you do not need to print it. You can save the soft file on your computer, gadget, or other devices, and read it wherever and whenever you like. One thing to remember is that reading never really ends: after finishing one book you will want to move on to another, and so it continues.

Summary

Practical Data Science with R lives up to its name. It explains basic principles without the theoretical mumbo-jumbo and jumps right to the real use cases you'll face as you collect, curate, and analyze the data crucial to the success of your business. You'll apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Book

Business analysts and developers are increasingly collecting, curating, analyzing, and reporting on crucial business data. The R language and its associated tools provide a straightforward way to tackle day-to-day data science tasks without a lot of academic theory or advanced mathematics.

Practical Data Science with R shows you how to apply the R programming language and useful statistical techniques to everyday business situations. Using examples from marketing, business intelligence, and decision support, it shows you how to design experiments (such as A/B tests), build predictive models, and present results to audiences of all levels.

This book is accessible to readers without a background in data science. Some familiarity with basic statistics, R, or another scripting language is assumed.

What's Inside

  • Data science for the business professional
  • Statistical analysis using the R language
  • Project lifecycle, from planning to delivery
  • Numerous instantly familiar use cases
  • Keys to effective data presentations

About the Authors

Nina Zumel and John Mount are cofounders of a San Francisco-based data science consulting firm. Both hold PhDs from Carnegie Mellon and blog on statistics, probability, and computer science at win-vector.com.

Table of Contents

PART 1 INTRODUCTION TO DATA SCIENCE
  • The data science process
  • Loading data into R
  • Exploring data
  • Managing data

PART 2 MODELING METHODS
  • Choosing and evaluating models
  • Memorization methods
  • Linear and logistic regression
  • Unsupervised methods
  • Exploring advanced methods

PART 3 DELIVERING RESULTS
  • Documentation and deployment
  • Producing effective presentations

  • Sales Rank: #69530 in Books
  • Published on: 2014-04-13
  • Original language: English
  • Number of items: 1
  • Dimensions: 9.10" h x 1.00" w x 7.30" l, 1.53 pounds
  • Binding: Paperback
  • 389 pages

    About the Author

    Nina Zumel co-founded Win-Vector, a data science consulting firm in San Francisco. She holds a Ph.D. in robotics from Carnegie Mellon and was a content developer for EMC's Data Science and Big Data Analytics Training Course. Nina also contributes to the Win-Vector Blog, which covers topics in statistics, probability, computer science, mathematics and optimization.

    John Mount co-founded Win-Vector, a data science consulting firm in San Francisco. He has a Ph.D. in computer science from Carnegie Mellon and over 15 years of applied experience in biotech research, online advertising, price optimization and finance. He contributes to the Win-Vector Blog, which covers topics in statistics, probability, computer science, mathematics and optimization.

    Most helpful customer reviews

    90 of 99 people found the following review helpful.
    Lost in the middle
    By Dimitri Shvorob
    A problem with the other reviews is that they consider the book in isolation, as if no alternatives were available. "Practical data science" is not the only machine-learning-lite book on the market: Manning itself had published Harrington's Python-based "Machine learning in action", Packt offers "Machine learning with R" by Lantz, O'Reilly boasts "Doing data science" by Schutt and O'Neil, and, finally, Springer has "Introduction to statistical learning" by James, Witten, Hastie and Tibshirani. I have seen and reviewed all except Harrington's; for the purposes of this review, I'll ultra-briefly describe each contender ("Machine learning with R" - thin, average-quality, superficial, but effective at what it sets out to achieve; "Doing data science" - a mash-up of a textbook and a magazine article about kewl data scientists; below-average quality, but a lot of pop appeal; "Introduction to statistical learning" - high-quality, accessible and visually appealing textbook with R illustrations) and get to "Practical data science" - which, to me, comes across as a better-organized, earnest version of "Doing data science". The book's forte is its effort to go beyond a catalogue of R-illustrated machine-learning methods - and you have to have seen similar books to know how standard this repertoire is - and discuss practical skills useful to a budding "data scientist", from version control to presenting. I appreciate this effort, but feel that this content was not sufficiently substantial or polished to develop into a "unique selling proposition" of the kind that each of its competitors has - hence the title of my review.

    UPD. With the benefit of a little more life experience, I would say: don't spend your time on *any* R book. Python is the way to go.

    29 of 30 people found the following review helpful.
    Effective starting point for your data science project
    By Christopher G. Loverich
    tl;dr: A well rounded, occasionally high-level introductory text that will leave you feeling prepared to participate in the Data Science conversation at work, from earliest planning to presentation and maintenance.

    Details:

    Was excited to see this book coming to publication. I'm a fan of practical, non-academic approaches to subjects and prefer working from concrete examples to abstract principles (rather than the other way around). I think this is both the most difficult and most needed type of resource that can be put into print. This book handles the task OK; it falls a bit short on practical, concrete use cases, as it alternates between working with hands-on datasets and shotgun coverage of principles and techniques at a higher level. I'd have much preferred sticking with single datasets for longer (say, a couple of chapters per dataset), but didn't feel cheated out of hands-on work.

    Pros:
    - Easy access to the datasets via Github; good documentation on where to find others
    - Key Takeaways provided at end of chapter are good summaries of overall information provided.
    - A good focus on not just data analysis, but the process as a whole; very Agile like, practical, and non-dogmatic.
    - Battle tested advice: You can tell some of the advice comes from hard-fought battles - ex: Why not use the sample() function instead of manually creating a sample column? Because with a sample column, you can repeatably sample the same data (e.g. all columns < 2) for repeatable output and for regression testing (avoiding introducing bugs).
    - Builds your analyst vocabulary, increasing your all-important google-fu skills. Not knowing what to Google is, imho, the single hardest problem when learning a new set of problems / api's.
    - Good use of Appendices for introducing R syntax / installation, rather than stuffing it into one of the early chapters.
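
The sample-column advice in the Pros list above can be sketched in a few lines. This is a minimal illustration in Python rather than R, and it is not code from the book: the column name `gp`, the helper names `add_sample_group` and `split_by_group`, the seed, and the toy customer records are all assumptions made for the example. Each row gets a pseudo-random value once; every later split filters on that stored value, so the same rows land in the same set on every run.

```python
import random


def add_sample_group(rows, seed=2014):
    """Attach a persistent pseudo-random 'gp' value in [0, 1) to each row.

    'gp' and the default seed are illustrative choices, not names from the book.
    """
    rng = random.Random(seed)  # local generator; the global random state is untouched
    return [dict(row, gp=rng.random()) for row in rows]


def split_by_group(rows, test_fraction=0.1):
    """Split on the stored 'gp' column instead of drawing a fresh sample."""
    test = [r for r in rows if r["gp"] <= test_fraction]
    train = [r for r in rows if r["gp"] > test_fraction]
    return train, test


# Hypothetical toy data: 1,000 customer records with one numeric field.
customers = [{"custid": i, "income": 20000 + 37 * i} for i in range(1000)]
customers = add_sample_group(customers)

# Splitting twice on the stored column selects exactly the same rows.
train, test = split_by_group(customers)
train_again, test_again = split_by_group(customers)
print(len(train), len(test), test == test_again)
```

Because the split depends only on the stored column, rerunning an analysis or a regression test sees exactly the same training and test rows. A fresh call to a sampling function would not guarantee that unless its seed were managed as well.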

    Cons:
    - Doesn't stick with data sets long enough. I went to the trouble of setting up a true database to use the first dataset (chapter 2); only to move on to a different data set in the very next chapter (book did eventually return to the data set).
    - Feels a bit back and forth at times on whether it wants to be a truly pragmatic, focused work or a principles-driven, broadly scoped book (thinking of chapters 5-7 here). Not necessarily a knock, depending on what you're looking for.

    I've read a few books on getting started in data analysis, R, statistics, etc. This book is solid enough that were I to choose among them, I'd recommend it first. I think if the book focused on using datasets for longer stretches, allowing you to learn the data well and apply multiple types of analyses on top of it (especially earlier on), it would be a bit more engaging.

    Lastly, it has good coverage of R principles but (per its scope) doesn't get into the nitty-gritty. I'd recommend "The Art of R Programming" for that, which would be a good companion to this book (i.e. it covers R but not data analysis). I've heard R in Action is good as well, though I haven't read it. Caveat emptor.

    Disclaimer: I received an e-copy of the book from Manning for review.

    47 of 53 people found the following review helpful.
    Good intro to data science
    By Scott C. Locklin
    I've had to hire recent graduates with degrees in machine learning, operations research and even "data science." One of the problems with such people: they don't know anything practical. They probably know the basics of regression and some classification routines, as learned in their coursework. They've probably worked on one or many data science like problems, using machine learning techniques or regression or what not. Many of them have never done a SQL query, or done the dirty business of data cleaning which takes up most of the data scientist's time. They'll always have gaps in their education; maybe they wrote a dissertation on an application of trees or deep learning, and have never used any of the other myriad tools available to the data scientist. None of them have ever done data science for money, and so none of them know about practical things like git or what the process looks like in an industrial setting. It is for these people that this book appears to be written. In an ideal world, all larval data scientists would be taught a course based on this book, or at least go through it themselves. It is also useful to experienced practitioners, as it covers many things, and can be a good practical reference to keep around. The book is ordered as a data science project would be ordered, from start to finish; so, as you proceed down an engagement, reviewing the chapters in order will be helpful.

    Ch1 describes the job of the data scientist, the workflow, and the characters you run into on a project.
    Ch2 outlines some of the tools used to get at the data, including the authors' tool, "SQL Screwdriver." I'd have liked some genuflections at the Unix tools used to clean data before it is put anywhere important (sed, awk, tr, sort and cut) here, but I'm not sure if there is a graceful way of doing this. Or perhaps I'm the only weirdo who uses these in the ETL process.
    Ch3 exploring data; using the various plot utilities in ggplot2 (the graphics library everyone should be using); bar charts, histograms, summary statistics and scatter plots.
    Ch4 managing data: what they call "cleaning data" I call reshaping data (and I use reshape, sometimes anyway; I would have mentioned this, though I got on well without it for years).
    Ch5 gets into specifying the problem: is it a classification problem? Scoring? Recommendation engine? How do I quantify success? This chapter is very helpful in doing this. Of course, problems evolve over time, and customers change their minds, but there are very helpful mappings here which will point you in the right direction. There are a few new techniques which should probably be included in future editions of this chapter, depending on how they pan out: I'm impressed with using dropout techniques to prevent overfitting, for example (this is bleeding-edge stuff, generally in the context of deep learning).
    Ch6 Memorization techniques covers Naive Bayes, KNN and decision trees. It would have been nice to have more information on the various kinds of variable selection techniques (particularly important for NB and KNN), but mentioning this will allow the practitioner to go find their own information.
    Ch7 Logistic and Linear regression: most would have done these first, but these are actually more complex than memorization techniques, and there are more things to know to keep the practitioner out of trouble. In my opinion, this chapter really shines: everyone who is going to do this for a living has had some exposure to regression models: this chapter makes it practical.
    Ch8 Unsupervised methods: covers clustering, including hierarchical clustering (one of the most useful tricks you will use in data science), k-means (it has to be done, though I never found it to be useful) and association rules.
    Ch9 Advanced methods: GAMs, SVM, bagging and random forests (the importance measure trick: if you don't know it, pay attention: this is a very good trick). These are the "industrial strength" tools used in industry. I, personally would have stuck GAMs in their own chapter, and mentioned boosting here, but everyone is a little different in their tastes.
    Ch10 Documentation and deployment: they use Knitr; I just use vanilla Sweave (I've tried brew, but never took to it). They introduce git here: something I would have done in chapter 1 or 2, but it is a fairly natural place to mention it. They use the Rook tool to deploy HTTP services; I've never used it, though I have used Shiny, which I can recommend. They mention PMML briefly (I've never used it).

    The appendix on R is helpful, though it doesn't include the most valuable advice of all for using R in production: you need to maintain a distribution of R and all used packages, as well as a dependency toolchain if the code will be deployed on multiple servers.
