Top Big Data opportunities for health startups

Frank Boermeester - May 23 2012

Big Data is a hot topic in healthcare, but what exactly is happening in the field? Where are the opportunities for startups? This article offers an overview of the key opportunities as we see them. We’d love to hear your feedback: do you agree? Have we missed anything? Let us know (frank at

Big Data’s Holy Grail: personalised medicine

Firstly, there are a number of big projects that involve big technology companies, big healthcare providers and big pharmaceutical companies. These projects are tackling what seems to be the holy grail of Big Data in healthcare: linking electronic health record data, DNA sequencing data, treatment guidelines and published research, with the intent of enabling personalized medicine.

Major hospitals are finding that their data volumes are exploding since they started using EHR systems, increasingly advanced imaging technology and DNA sequencing. Some serious hardware is required if that data is to be made available to researchers and developers of clinical decision-support applications.

Hence, IBM is making its Watson supercomputer available to the Memorial Sloan-Kettering Cancer Center to develop a decision-support application for cancer treatment.  The oncologists will be using Watson to gather patient information (they already have a database of more than 1 million patients), treatment guidelines and published research with the aim of developing personalized treatment options.

In a similar vein, Dell has announced that it is making its cloud infrastructure available to The Translational Genomics Research Institute (TGen) for what they are calling the world’s first personalized medicine trial for pediatric cancer (12 medical centers will be enrolling patients). Here the researchers will be incorporating genome data (which means that more than 200 billion measurements per patient will need to be analyzed) to find the most appropriate treatments for individual patients.

Data management and analytical tools for hospitals

Most of the operational and clinical data collected by healthcare institutions is locked up in closed systems that weren’t designed for data analysis. There’s clearly an opportunity to help hospitals clean up their data, merge it with other data sets and prepare it for easy analysis. For example, Explorys, founded in 2009, is a rapidly growing Big Data PaaS/SaaS provider to many major US hospitals. Essentially, they curate their customers’ clinical and operational data, opening it up to real-time exploration and predictive analytics.
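To make the "clean up and merge" step concrete, here is a minimal sketch in Python. It is not how Explorys or any hospital system actually works; the record layouts, field names and the `normalise_id` and `merge_records` helpers are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical extracts from two closed hospital systems (made-up data).
clinical = [
    {"patient_id": "P001", "diagnosis": "asthma"},
    {"patient_id": "p002 ", "diagnosis": "diabetes"},  # messy identifier
]
operational = [
    {"patient_id": "P001", "length_of_stay_days": 3},
    {"patient_id": "P002", "length_of_stay_days": 5},
]


def normalise_id(raw):
    """Clean up an identifier: strip whitespace and upper-case it."""
    return raw.strip().upper()


def merge_records(*sources):
    """Merge records from several sources into one dict per patient."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            pid = normalise_id(record["patient_id"])
            # Copy every field except the identifier itself.
            fields = {k: v for k, v in record.items() if k != "patient_id"}
            merged[pid].update(fields)
    return dict(merged)


combined = merge_records(clinical, operational)
```

Even in this toy version most of the work is in reconciling identifiers, which hints at why curation is a business in its own right.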

Open Data

In the Netherlands there is an interesting Open Data initiative by the city of Almere. Almere DataCapital, as the project is called, is a collaboration between health providers, government and knowledge institutions (and several ICT companies) to develop a vendor-neutral Big Data platform. The idea here is not simply to make healthcare data manageable and ready for analysis, but to open that data up to third party developers in a neutral way. Expect a lot more public authorities to do similar things and expect a lot more data to become available.  Great stuff for hackathons (check out

Clinical interfaces

The increasing amount of data being collected within the healthcare system is a good thing, but it could also overwhelm healthcare workers if the resulting insights and applications aren’t delivered in a user-friendly manner. Already there are too many diverse and highly complex interfaces in a typical hospital. The trick will be to integrate data from different systems and deliver it via interfaces that look more like an iPad than a mainframe terminal from the 1960s.

Clinical decision support using external data

There’s a lot of medical information out there for clinicians, but much of it is locked up in unwieldy formats (clinical trials data, medical journals, books, etc). To address that problem, Agile Diagnosis brings clinical decision support (integrating clinical best practice data) to a tablet PC. More ambitiously, NantWorks is partnering with Verizon to create a cloud database of cancer treatment protocols, to be called the Cancer Knowledge Action Network, so that doctors can consult such data from their mobile devices. NantWorks says that it will be turning masses of data about patients, procedures and treatments into accessible information for the user.

Medical imaging

Medical imaging technology is advancing rapidly (shifting from regular imaging to functional imaging such as fMRI, SPECT, PET etc.) but is leading to an overwhelming volume of heterogeneous imaging data. Interesting research is being conducted by institutions such as IBBT in Belgium to discover new diagnostic applications of such imaging data, for example by combining different types of imaging data or by analyzing and visualizing such data in new ways. Is there an opportunity for companies with specialized skills in imaging data analytics to make a contribution?

Genome data

With the declining costs of genome sequencing technology, the volume of biological data is about to explode.  There should be plenty of opportunity in developing computational methods for organising such data from different sources, for integrating data in clever ways, for analysing and visualising such data and for applying such data for diagnosis and disease discovery.

Remote patient monitoring data

In principle it is possible to monitor a myriad of patient characteristics remotely (via medical devices, wearable devices, home sensing devices, video monitoring, etc) but someone needs to make sense of all that data efficiently for telemedicine and telecare to become feasible. A lot of clever software will be needed to help clinicians and patients make use of remote monitoring data.
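As a rough illustration of what "making sense" of monitoring data could mean at its very simplest, the Python sketch below flags readings that stray far from a patient's own recent baseline. The `flag_unusual` function, the window size and the 20% tolerance are assumptions chosen for the example, not clinical guidance or any vendor's method.

```python
from statistics import mean


def flag_unusual(readings, window=5, tolerance=0.2):
    """Return indexes of readings that differ from the mean of the
    previous `window` readings by more than `tolerance` (a fraction)."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = mean(readings[i - window:i])
        if abs(readings[i] - baseline) > tolerance * baseline:
            flagged.append(i)
    return flagged


# Hypothetical heart-rate samples (beats per minute) from a wearable.
heart_rate = [72, 70, 74, 71, 73, 72, 110, 75, 73]
alerts = flag_unusual(heart_rate)
```

Here only the jump to 110 is flagged, so a clinician would see one alert rather than nine raw numbers. Real products would need far more than this (signal quality, context, clinical validation), which is exactly where the opportunity lies.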

The Big Data opportunity in every health gadget and app

While the above projects are geared toward the analysis of data that is currently being collected by the healthcare system–hospitals, clinicians, administrators–there is another class of projects where the data is being collected from and by consumers-patients directly via various self-tracking apps, social networks and personal health records. And people are  clearly tracking an increasingly diverse range of data, including vital signs, symptoms, temperature, movement, sleep, mood, medication intake, and so on.

Take Everyheartbeat, for example. Being developed by the USC Center for Body Computing, Everyheartbeat will be a platform that lets anyone log their heart rate data using their wireless phone. Planned for launch in late 2013, Everyheartbeat hopes ultimately to connect more than 5 billion mobile phones to the health ecosystem. The vision behind the project is enticing. Imagine a future where everyone’s body is permanently connected to the network, tracking health data on an ongoing basis. Imagine the wealth of data this will generate for medical research. And imagine the benefits for you and me: owning and understanding your health data, predicting potential illness and taking early preventative action, your doctor calling you for an appointment as opposed to the other way round. What is also really interesting about these types of projects is that they aim to collect data from the general (healthy) population; data collected by the healthcare system is by definition slanted toward people with illness.

While the primary intent of Everyheartbeat is to collect lots of data, in the hope of delivering services later, the model in reverse is applicable to practically any digital health service that needs to collect personal health data as part of its service. For example, Cambridge Temperature Concepts sells a fertility monitoring device to help women detect when they are at the most fertile stage of their cycle. That is its core value proposition at present. However, the company is also collecting masses of valuable medical data. The device tracks each user’s movement and temperature 24 hours a day. Furthermore, users also report any symptoms or circumstances that they feel could have an impact on their body temperature. As such, this data could be tremendously useful for fertility research, or in totally different areas (e.g. as a healthy control group).

The best known cases that illustrate this ‘tangential’ big data opportunity are probably the social networks PatientsLikeMe and CureTogether (who are already putting their aggregated user data to use for research purposes), but the principle could apply to any service that is collecting health data.

Especially exciting are initiatives to make this personal sensing data available to third party developers (akin to Almere DataCapital, except that the data comes from personal sensing devices rather than from clinical systems). Greengoose, for example, makes sensing body stickers but allows third-party developers to create the apps that could feasibly make use of that data (via an API). In the years ahead a lot more ‘open’ data should become available from such sensing platforms, from smart phones obviously, but also from social networks and personal health records. Hence there should be opportunities for middleware companies that integrate data from various sources and make it available to developers (such as Sense-os), and of course for developers who create clever applications that rely on data from various sources.

HealthStartup on Big Data

At our next HealthStartup event in Nijmegen in the Netherlands (26 June 2012) we will be focusing on Big Data. To be clear, we’re interested in all ‘types’ of Big Data projects described here (and others we may have missed).  Some projects are likely to be Big Data  projects by definition because their core purpose is  to gather, integrate and analyze vast datasets.  There will be others, however, who currently do different things (e.g. make a body monitoring device) but have a longer-term Big Data strategy (e.g. making use  of aggregated sensing data).  With regard to the latter we don’t expect startups to be actively processing and analyzing vast datasets just yet, but it will be interesting to hear about their strategy in this area.  For example, what do you intend to do with all that data once you have it? What business model will you pursue (e.g. sell data like PatientsLikeMe or publish an API like Greengoose to drive demand for your sensor devices)? Have you given thought to the privacy and legal issues? And what about the technical aspects? Do you intend to partner with other data owners/aggregators? Questions such as these will be the focus of the event in June. It will be an opportunity to explore the big data opportunity and evaluate different strategies; and if you’re a pitching startup – to receive in-depth, focused feedback on improving those strategies.

