Analytics
Analytics | News, how-tos, features, reviews, and videos
Generative AI adoption speed unprecedented, O’Reilly survey says
Survey of enterprise users of generative AI finds rapid adoption but also hurdles, with difficulty finding business use cases, legal uncertainties, and high infrastructure costs as the top concerns.
How Apache Arrow accelerates InfluxDB
The Apache Arrow in-memory columnar format has become a critical component of many analytical database systems and tools. It brings a number of advantages to InfluxDB.
How generative AI changes the data journey
By making cryptic machine data human readable, generative AI will dramatically reduce the time and energy IT teams spend on managing and interpreting data generated by operational systems.
How to apply design thinking in data science
Design thinking is critical for developing data-driven business tools that surpass end-user expectations. Here's how to apply the five stages of design thinking in your data science projects.
The best Python libraries for parallel processing
Do you need to distribute a heavy Python workload across multiple CPUs or a compute cluster? Here are seven frameworks up to the task.
How Apache Arrow speeds big data processing
Apache Arrow defines an in-memory columnar data format that accelerates processing on modern CPU and GPU hardware, and enables lightning-fast data access between systems.
Python pandas creator Wes McKinney joins Posit
Python pandas creator Wes McKinney has joined data science company Posit as a principal architect, signaling the company's efforts to play a bigger role in the Python universe as well as the R ecosystem.
Snowflake to add developer tools to Snowpark, plans cost management feature
The cost management feature, which is still in private preview, is expected to help enterprises optimize their expenditure on Snowflake.
Snowflake’s Cortex to bring generative AI to its Data Cloud platform
Cortex is designed to help streamline the development of data-driven applications, use cases, AI and ML models, and foundation models from Snowpark.
Apache Flink 101: A guide for developers
The de facto standard for real-time stream processing is sometimes described as being complex and difficult to learn. Start by understanding these core principles.
Transforming spatiotemporal data analysis with GPUs and generative AI
As GPU-accelerated databases bring new levels of performance and precision to time-series and spatial workloads, generative AI puts complex analysis within reach of non-experts.
The best open source software of 2023
InfoWorld’s 2023 Bossie Awards recognize the year’s leading open source tools for software development, data management, analytics, AI, and machine learning.
How to have encryption, computation, and compliance all at once
Baffle Advanced Encryption was designed to overcome the barriers to adopting encryption for analytics. Here’s how it enables compliant, privacy-enhanced computation.
What happened to edge computing?
Edge computing offers lower latency and bandwidth savings, but the lack of standards and problems with interoperability and security still need to be addressed.
BI meets data science in Microsoft Fabric
Microsoft’s cloud-hosted data lake and lakehouse platform gains new data science tools and opens up Power BI datasets to Python, R, and SparkSQL.
Review: 7 Python IDEs compared
What's the best IDE for Python? Here's how IDLE, Komodo, PyCharm, PyDev, Microsoft's Python and Python Tools extensions for Visual Studio Code, and Spyder stack up.
How to size and scale Apache Kafka, without tears
The first step to sizing or scaling Kafka for optimal cost and performance is understanding how the data streaming platform uses resources. Here’s a primer.
The future of data transformation is collaborative
Data-driven decision-making suffers from a mismatch between the tools, skills, and understanding of IT and data consumers in most enterprises. Here’s how to bridge the gap.