Data lake upstart Upsolver takes aim at Databricks

The San Francisco-based startup has released a SQL-based, self-orchestrating data pipeline platform, claiming it will go toe-to-toe with Databricks’ Delta Live Tables.

Book review: 'Python Tools for Scientists'

Python has a wealth of scientific computing tools, so how do you decide which ones are right for you? This book cuts through the noise to help you deliver results.

5 modelops capabilities that boost data science productivity

Organizations are hiring data scientists to develop ML models and experiment with AI, but the business impact is lagging for many large enterprises.

Data visualization with Observable JavaScript

Learn how to make the most of Observable JavaScript and the Observable Plot library, including a step-by-step guide to eight basic data visualization tasks in Plot.

Learn Observable JavaScript with Observable notebooks

Free, hosted Observable notebooks provide an interactive experience and lots of free, open-source Observable JS code you can reuse and learn from. Here's how to get started.

A beginner's guide to using Observable JavaScript, R, and Python with Quarto

Using Quarto with Observable JavaScript is a great solution for R and Python users who want to create more interactive and visually engaging reports.

How to choose a cloud machine learning platform

12 capabilities every cloud machine learning platform should provide to support the complete machine learning lifecycle—and which cloud machine learning platforms provide them.

The importance of monitoring machine learning models

Changing assumptions and ever-changing data mean the work doesn’t end after deploying machine learning models to production. These best practices keep complex models reliable.

MIT startup DataCebo offers tool to evaluate synthetic data

Synthetic Data Metrics is an open-source Python library for evaluating model-agnostic tabular data by pitting machine-generated data sets against real data sets.

When is enough data enough?

Maybe we don’t need more data; we just need people who understand the data we already have and its value in a business context.

Use Cython to accelerate array iteration in NumPy

NumPy is known for being fast, but there's always room for improvement. Learn how to use Cython to iterate over NumPy arrays at the speed of C.

IT career roadmap: Data scientist

Reading Freakonomics awakened his passion for data science. Here's how further education and thoughtful career moves led to becoming a data scientist.

RStudio changes name to Posit, expands focus to include Python and VS Code

RStudio is updating its name as it aims to expand use of its commercial products among data science teams using both Python and R.

3 data quality metrics dataops should prioritize

Data-driven decisions require data that is trustworthy, available, and timely. Upping the dataops game is a worthwhile way to offer business leaders reliable insights.

How to attend RStudio Conference 2022 remotely for free

Keynotes and presentations will be streamed live. Plus, there will be a Discord server for virtual attendees.

Why do businesses suck at using data?

Few enterprises can effectively leverage their data inside or outside of the cloud, and a new study says that's still the case. It's time to make a plan.

What is behavioral analytics and when is it important?

The ability to mine large amounts of data to study how users act offers long-reaching business benefits and risk reduction opportunities.

What is TensorFlow? The machine learning library explained

TensorFlow is a Python-friendly open source library for numerical computation that makes machine learning and developing neural networks faster and easier.

