Data Science | News, how-tos, features, reviews, and videos
An introduction to time series forecasting
Time series forecasts are used to predict a future value or a classification at a particular point in time. Here’s a brief overview of their common uses and how they are developed.
6 essential Python tools for data science—now improved
SciPy, Numba, Cython, Dask, Vaex, and Intel SDC all have new versions that aid big data analytics and machine learning projects.
The real successes of AI
Despite the hype, especially around self-driving cars, AI is writing code, designing Google chip floor plans, and telling us how much to trust it.
Getting started with time series analysis
Time series analysis involves identifying attributes of your time series data, such as trend and seasonality, by measuring statistical properties.
How the cloud and big compute are remaking HPC
High-performance computing projects require massive quantities of compute resources. Pairing simulation and specialized hardware with the cloud powers the breakthroughs of the future.
How to visualize time series data
Visualizing time series data is often the first step in observing trends that can guide time series modeling and analysis.
Speed up your Python with Numba
Want faster number-crunching in Python? You can speed up your existing Python code with the Numba JIT, often with only one instruction.
Simplify machine learning with Azure Applied AI Services
Microsoft is wrapping its Cognitive Services machine learning platform as business-focused services.
Get started with Anaconda Python
Anaconda provides a handy GUI, a slew of work environments, and tools to simplify the process of using Python for data science.
Why you need a data integration platform
With every organization generating and accessing multiple data sources, an integration platform ensures every team has the data they need to drive the business forward.
Excel, Python, and the future of data science
If the ubiquitous spreadsheet program is the gateway to data science, Python aims to be the next step.
Use the new R pipe built into R 4.1
Learn the new pipe operator built into R 4.1 and how it differs from the magrittr pipe. Don’t want to install R 4.1 yet? See how to run R 4.1 in a Docker container.
The value of time series data and TSDBs
Time series data provides key insights in domains ranging from science and medicine to systems monitoring and industrial IoT. Understand time series data and the databases designed to ingest, store, and analyze it.
Dataiku review: Data science fit for the enterprise
Dataiku’s end-to-end machine learning platform combines visual tools, notebooks, and code to address the needs of data scientists, data engineers, business analysts, and AI consumers.
Review: 7 Python IDEs go to the mat
Which Python IDE is right for you? Here’s how IDLE, Komodo, LiClipse, PyCharm, Python extension for Visual Studio Code, Python Tools for Visual Studio, and Spyder stack up in capabilities and ease of use.
Python is devouring data science
Someone once said that Python’s data science training wheels would increasingly lead to the R language. Boy, was he wrong.
Data lineage: What it is and why it’s important
As your data evolves, you need a way to track the who, what, when, why, and how of those changes. You need a data lineage system.
How to send emails with graphics from R
See how to use the blastula package to send emails with text, graphs, and analysis right from R.
Data wrangling and exploratory data analysis explained
Data rarely comes in usable form. Data wrangling and exploratory data analysis are the difference between a good data science model and garbage in, garbage out.
What we just learned about data science — and what’s next
The past 12 months have revealed how valuable data science can be while also exposing its limitations. Expect big advances in the year to come.