Feature Engineering for Machine Learning

Feature Engineering for Machine Learning
Author :
Publisher : "O'Reilly Media, Inc."
Total Pages : 218
Release :
ISBN-10 : 9781491953198
ISBN-13 : 1491953195
Rating : 4/5 (98 Downloads)

Synopsis Feature Engineering for Machine Learning by : Alice Zheng

Feature engineering is a crucial step in the machine-learning pipeline, yet this topic is rarely examined on its own. With this practical book, you’ll learn techniques for extracting and transforming features—the numeric representations of raw data—into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering. Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book. The closing chapter brings everything together by tackling a real-world, structured dataset with several feature-engineering techniques. Python packages including numpy, Pandas, Scikit-learn, and Matplotlib are used in code examples. You’ll examine: Feature engineering for numeric data: filtering, binning, scaling, log transforms, and power transforms Natural text techniques: bag-of-words, n-grams, and phrase detection Frequency-based filtering and feature scaling for eliminating uninformative features Encoding techniques of categorical variables, including feature hashing and bin-counting Model-based feature engineering with principal component analysis The concept of model stacking, using k-means as a featurization technique Image feature extraction with manual and deep-learning techniques

The Art of Feature Engineering

The Art of Feature Engineering
Author :
Publisher : Cambridge University Press
Total Pages : 287
Release :
ISBN-10 : 9781108709385
ISBN-13 : 1108709389
Rating : 4/5 (85 Downloads)

Synopsis The Art of Feature Engineering by : Pablo Duboue

A practical guide for data scientists who want to improve the performance of any machine learning solution with feature engineering.

Feature Engineering and Selection

Feature Engineering and Selection
Author :
Publisher : CRC Press
Total Pages : 266
Release :
ISBN-10 : 9781351609463
ISBN-13 : 1351609467
Rating : 4/5 (63 Downloads)

Synopsis Feature Engineering and Selection by : Max Kuhn

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for nding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques along with R programs for reproducing the results.

Python Feature Engineering Cookbook

Python Feature Engineering Cookbook
Author :
Publisher : Packt Publishing Ltd
Total Pages : 364
Release :
ISBN-10 : 9781789807820
ISBN-13 : 1789807824
Rating : 4/5 (20 Downloads)

Synopsis Python Feature Engineering Cookbook by : Soledad Galli

Extract accurate information from data to train and improve machine learning models using NumPy, SciPy, pandas, and scikit-learn libraries Key FeaturesDiscover solutions for feature generation, feature extraction, and feature selectionUncover the end-to-end feature engineering process across continuous, discrete, and unstructured datasetsImplement modern feature extraction techniques using Python's pandas, scikit-learn, SciPy and NumPy librariesBook Description Feature engineering is invaluable for developing and enriching your machine learning models. In this cookbook, you will work with the best tools to streamline your feature engineering pipelines and techniques and simplify and improve the quality of your code. Using Python libraries such as pandas, scikit-learn, Featuretools, and Feature-engine, you’ll learn how to work with both continuous and discrete datasets and be able to transform features from unstructured datasets. You will develop the skills necessary to select the best features as well as the most suitable extraction techniques. This book will cover Python recipes that will help you automate feature engineering to simplify complex processes. You’ll also get to grips with different feature engineering strategies, such as the box-cox transform, power transform, and log transform across machine learning, reinforcement learning, and natural language processing (NLP) domains. By the end of this book, you’ll have discovered tips and practical solutions to all of your feature engineering problems. What you will learnSimplify your feature engineering pipelines with powerful Python packagesGet to grips with imputing missing valuesEncode categorical variables with a wide set of techniquesExtract insights from text quickly and effortlesslyDevelop features from transactional data and time series dataDerive new features by combining existing variablesUnderstand how to transform, discretize, and scale your variablesCreate informative variables from date and timeWho this book is for This book is for machine learning professionals, AI engineers, data scientists, and NLP and reinforcement learning engineers who want to optimize and enrich their machine learning models with the best features. Knowledge of machine learning and Python coding will assist you with understanding the concepts covered in this book.

Python Data Science Handbook

Python Data Science Handbook
Author :
Publisher : "O'Reilly Media, Inc."
Total Pages : 609
Release :
ISBN-10 : 9781491912133
ISBN-13 : 1491912138
Rating : 4/5 (33 Downloads)

Synopsis Python Data Science Handbook by : Jake VanderPlas

For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools. Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python. With this handbook, you’ll learn how to use: IPython and Jupyter: provide computational environments for data scientists using Python NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python Matplotlib: includes capabilities for a flexible range of data visualizations in Python Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms

Feature Engineering Made Easy

Feature Engineering Made Easy
Author :
Publisher : Packt Publishing Ltd
Total Pages : 310
Release :
ISBN-10 : 9781787286474
ISBN-13 : 1787286479
Rating : 4/5 (74 Downloads)

Synopsis Feature Engineering Made Easy by : Sinan Ozdemir

A perfect guide to speed up the predicting power of machine learning algorithms Key Features Design, discover, and create dynamic, efficient features for your machine learning application Understand your data in-depth and derive astonishing data insights with the help of this Guide Grasp powerful feature-engineering techniques and build machine learning systems Book Description Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective. You will start with understanding your data—often the success of your ML models depends on how you leverage different feature types, such as continuous, categorical, and more, You will learn when to include a feature, when to omit it, and why, all by understanding error analysis and the acceptability of your models. You will learn to convert a problem statement into useful new features. You will learn to deliver features driven by business needs as well as mathematical insights. You'll also learn how to use machine learning on your machines, automatically learning amazing features for your data. By the end of the book, you will become proficient in Feature Selection, Feature Learning, and Feature Optimization. What you will learn Identify and leverage different feature types Clean features in data to improve predictive power Understand why and how to perform feature selection, and model error analysis Leverage domain knowledge to construct new features Deliver features based on mathematical insights Use machine-learning algorithms to construct features Master feature engineering and optimization Harness feature engineering for real world applications through a structured case study Who this book is for If you are a data science professional or a machine learning engineer looking to strengthen your predictive analytics model, then this book is a perfect guide for you. Some basic understanding of the machine learning concepts and Python scripting would be enough to get started with this book.

Introduction to Machine Learning with Python

Introduction to Machine Learning with Python
Author :
Publisher : "O'Reilly Media, Inc."
Total Pages : 429
Release :
ISBN-10 : 9781449369897
ISBN-13 : 1449369898
Rating : 4/5 (97 Downloads)

Synopsis Introduction to Machine Learning with Python by : Andreas C. Müller

Machine learning has become an integral part of many commercial applications and research projects, but this field is not exclusive to large companies with extensive research teams. If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination. You’ll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Müller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book. With this book, you’ll learn: Fundamental concepts and applications of machine learning Advantages and shortcomings of widely used machine learning algorithms How to represent data processed by machine learning, including which data aspects to focus on Advanced methods for model evaluation and parameter tuning The concept of pipelines for chaining models and encapsulating your workflow Methods for working with text data, including text-specific processing techniques Suggestions for improving your machine learning and data science skills

Practical Automated Machine Learning on Azure

Practical Automated Machine Learning on Azure
Author :
Publisher : "O'Reilly Media, Inc."
Total Pages : 190
Release :
ISBN-10 : 9781492055549
ISBN-13 : 1492055549
Rating : 4/5 (49 Downloads)

Synopsis Practical Automated Machine Learning on Azure by : Deepak Mukunthu

Develop smart applications without spending days and weeks building machine-learning models. With this practical book, you’ll learn how to apply automated machine learning (AutoML), a process that uses machine learning to help people build machine learning models. Deepak Mukunthu, Parashar Shah, and Wee Hyong Tok provide a mix of technical depth, hands-on examples, and case studies that show how customers are solving real-world problems with this technology. Building machine-learning models is an iterative and time-consuming process. Even those who know how to create ML models may be limited in how much they can explore. Once you complete this book, you’ll understand how to apply AutoML to your data right away. Learn how companies in different industries are benefiting from AutoML Get started with AutoML using Azure Explore aspects such as algorithm selection, auto featurization, and hyperparameter tuning Understand how data analysts, BI professions, developers can use AutoML in their familiar tools and experiences Learn how to get started using AutoML for use cases including classification, regression, and forecasting.

Feature Learning and Understanding

Feature Learning and Understanding
Author :
Publisher : Springer Nature
Total Pages : 299
Release :
ISBN-10 : 9783030407940
ISBN-13 : 3030407942
Rating : 4/5 (40 Downloads)

Synopsis Feature Learning and Understanding by : Haitao Zhao

This book covers the essential concepts and strategies within traditional and cutting-edge feature learning methods thru both theoretical analysis and case studies. Good features give good models and it is usually not classifiers but features that determine the effectiveness of a model. In this book, readers can find not only traditional feature learning methods, such as principal component analysis, linear discriminant analysis, and geometrical-structure-based methods, but also advanced feature learning methods, such as sparse learning, low-rank decomposition, tensor-based feature extraction, and deep-learning-based feature learning. Each feature learning method has its own dedicated chapter that explains how it is theoretically derived and shows how it is implemented for real-world applications. Detailed illustrated figures are included for better understanding. This book can be used by students, researchers, and engineers looking for a reference guide for popular methods of feature learning and machine intelligence.

Data Science Revealed

Data Science Revealed
Author :
Publisher : Apress
Total Pages : 252
Release :
ISBN-10 : 1484268695
ISBN-13 : 9781484268698
Rating : 4/5 (95 Downloads)

Synopsis Data Science Revealed by : Tshepo Chris Nokeri

Get insight into data science techniques such as data engineering and visualization, statistical modeling, machine learning, and deep learning. This book teaches you how to select variables, optimize hyper parameters, develop pipelines, and train, test, and validate machine and deep learning models. Each chapter includes a set of examples allowing you to understand the concepts, assumptions, and procedures behind each model. The book covers parametric methods or linear models that combat under- or over-fitting using techniques such as Lasso and Ridge. It includes complex regression analysis with time series smoothing, decomposition, and forecasting. It takes a fresh look at non-parametric models for binary classification (logistic regression analysis) and ensemble methods such as decision trees, support vector machines, and naive Bayes. It covers the most popular non-parametric method for time-event data (the Kaplan-Meier estimator). It also covers ways of solving classification problems using artificial neural networks such as restricted Boltzmann machines, multi-layer perceptrons, and deep belief networks. The book discusses unsupervised learning clustering techniques such as the K-means method, agglomerative and Dbscan approaches, and dimension reduction techniques such as Feature Importance, Principal Component Analysis, and Linear Discriminant Analysis. And it introduces driverless artificial intelligence using H2O. After reading this book, you will be able to develop, test, validate, and optimize statistical machine learning and deep learning models, and engineer, visualize, and interpret sets of data. What You Will Learn Design, develop, train, and validate machine learning and deep learning models Find optimal hyper parameters for superior model performance Improve model performance using techniques such as dimension reduction and regularization Extract meaningful insights for decision making using data visualization Who This Book Is For Beginning and intermediate level data scientists and machine learning engineers