Background & Thesis
The Big Data market was valued at around USD 162.6 billion in 2021 and is projected to grow to around USD 273.4 billion (or more, depending on the source) by 2026, a Compound Annual Growth Rate (CAGR) of around 11% to 13% over the forecast period. Sharp growth in data volume, together with new AI and ML frameworks and breakthroughs, drives this industry. Data connectivity through cloud computing and the inclusion of digital transformation in top-level strategies increase every year, which creates new demand for innovation at the data layer of the stack.
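As a quick sanity check on the growth figures, the CAGR implied by the two market-size estimates can be recomputed directly. This is a minimal sketch: the function name `cagr` is our own, and the five-year horizon (2021 to 2026) is an assumption taken from the dates quoted above.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above, in USD billions; the horizon is assumed to be 2021-2026.
implied = cagr(start_value=162.6, end_value=273.4, years=2026 - 2021)
print(f"Implied CAGR: {implied:.1%}")
```

The result lands at roughly 11%, the low end of the quoted range. Sources that project a larger 2026 market imply rates closer to 13%.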
By 2023, analytics adoption by enterprises will increase by 50%, driven by vertical-specific and domain-specific augmented analytics solutions.
By 2025, 90% of all enterprise software buying decisions will occur outside the IT organization.
According to Gartner, data management software consists of five submarkets. This breakdown excludes the database management system (DBMS) market, which is much larger ($80.3 billion in 2021) and needs to be analyzed separately.
It is evident that the overall Data Platform market is the fastest-growing space in the IT industry, well ahead of the cybersecurity market, for example. It is driving an entire ecosystem of solutions and innovation that an investment strategy can tap into.
Data Science Solutions Definition
Data science is an interdisciplinary field focused on extracting knowledge from typically large data sets and applying the knowledge and insights from that data to solve problems in a wide range of application domains. The field encompasses preparing data for analysis, formulating data science problems, analyzing data, developing data-driven solutions, and presenting findings to inform high-level decisions in various application domains. As such, it incorporates skills from computer science, statistics, information science, mathematics, data visualization, information visualization, data integration, complex systems, communication, and business.
The definition above is broad. To identify relevant solutions, we will focus on the following areas:
✅ End-to-end infrastructure and application testing and visibility, specifically for the data space.
✅ Data Quality solutions, Data Contracts, Data Mesh.
✅ Smart Data Catalog and Semantic Layers.
✅ Data Portal and self-discovery solutions.
✅ Generative AI models and Platforms. LLMOps, MLOps and DataOps solutions.
✅ ETL/Reverse-ETL pipelines and solutions.
✅ New storage layers that optimize data querying and reduce cost.
✅ Smart caching solutions on top of new data storage layers.
✅ Data warehouses, integrations, and related systems (such as Snowflake alternatives, optimization, cost reduction, object storage, and other solutions).
✅ Data analytics visualization tools and insights automation that reduce the need for data analysts. For example, generative solutions that turn natural language into visualizations and SQL queries.
✅ Machine Learning platforms focused on the Data Engineer persona and the data developer.
✅ Security solutions built specifically to secure information at rest and to provide authentication and access management at the finest granularity.
What we will not consider in the data space:
- Classic observability-related monitoring systems.
- Point solution algorithms.
- Mathematical implementations of research papers that are hard to implement for generic use cases and broad usage in Data platforms.
- Visualization libraries and solutions that require manual discovery and compete with the very large players in this space.
- Generic security solutions that have some plugins in the data space.
Goals
- Lead data-related seed investments around the world when Israelis are part of the core founding team (we will make exceptions in specific cases).
- Lead data-related seed investments in Europe as we expand beyond Israeli founders.
- Create an Ideation/Incubation program for mature and experienced founders who wish to solve problems in the data space (and be the first money in the deal).
Next:
Operating Plan