Guest post by Megan Ray Nichols
Cloud storage and other big data technologies have changed the way we look at and store information. What will these changes mean for the pharmaceutical industry? Will they alter it for the better, or could they introduce new problems?
Waiting for Answers
One of the biggest problems researchers face in the pharmaceutical industry is that information is generally treated as a proprietary, closely guarded secret. Individual researchers and companies put so much effort into protecting their data from outsiders that, when the time comes to share it with the public, investors, or other pharmaceutical companies, disseminating it becomes difficult or nearly impossible.
Many companies have started to share their raw clinical trial information with the industry, but it’s a slow process. In the meantime, the data that might lead to the next wonder drug or medical breakthrough is sitting in limbo, gathering virtual dust because it can be so difficult to access.
Genomics and Data
One of the biggest uses for big data in the pharmaceutical industry is in the field of genomics. You need a lot of space and quite a bit of computing power to sequence a human genome — when you’re dealing with 25,000 genes and three billion base pairs of DNA, you’re looking at about 1.5 gigabytes of storage per genome sequenced. To put it in perspective, that’s about the size of a 1080p movie file.
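The 1.5 gigabyte figure can be reproduced with some quick arithmetic. The sketch below is only a back-of-the-envelope check: it assumes each base (A, C, G or T) is packed into 2 bits and that both copies of each chromosome are stored, neither of which is stated in the article. Raw sequencer output, which also carries quality scores and overlapping reads, is typically far larger than this.

```python
# Back-of-the-envelope storage estimate for packed genome sequence.
BASE_PAIRS = 3_000_000_000  # base pairs per genome copy (figure from the article)
BITS_PER_BASE = 2           # assumption: A, C, G, T encoded in 2 bits each
COPIES = 2                  # assumption: both chromosome copies are stored


def genome_bytes(base_pairs=BASE_PAIRS, bits_per_base=BITS_PER_BASE, copies=COPIES):
    """Return the bytes needed to store one genome as packed bases."""
    return base_pairs * bits_per_base * copies // 8


def cohort_gigabytes(subjects):
    """Return the packed-sequence storage for a group of subjects, in GB."""
    return subjects * genome_bytes() / 1e9


print(genome_bytes() / 1e9)   # 1.5 (GB per genome)
print(cohort_gigabytes(200))  # 300.0 (GB for 200 subjects)
```

Halving `COPIES` to 1 gives roughly 750 MB, so the estimate is sensitive to what exactly is being stored.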
If you’re sequencing the genomes of a couple hundred test subjects, you may find yourself carting around dozens of heavy hard drives in order to carry all of that information. Alternatively, though, you could look into cloud data storage.
The ability to share genomic data via cloud technology has two main benefits. First, it cuts down on the amount of physical storage space you need. All that’s required is a computer with Internet access to connect to all of your data.
Second, it makes it easier and faster to share raw research data with the rest of the pharmaceutical community. If a researcher in Tokyo makes a discovery that could shake the entire industry to its core, they don’t have to sit through the peer review process, waiting weeks or months to publish a paper that could change the world. All they have to do is upload their research to the cloud. It’s as simple as that.
Now, most researchers aren’t uploading their publishable discoveries, preferring instead to share their raw research data, but the platform is still there to provide a stepping stone for genomic data discoveries.
Quantifying Data
While cloud storage is a great platform for sharing research data, that’s not the only thing it’s good for. It can also be used to help researchers quantify the raw data that has been placed in the cloud.
Having sequenced genome information stored for a variety of different subjects is great, but it can be daunting to sift through if you’re looking only for subjects of a particular race, gender or age.
Cloud storage, when paired with a little bit of simple software, allows researchers to search the stored data for specific traits without having to pick through each individual genome. Cambridge Semantics’ semantic web software is just one of the tools researchers can use to sift through the raw data.
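To make the idea concrete, here is a minimal sketch of that kind of search. It is not how any particular product works: the `SubjectRecord` fields, the `find_subjects` helper, the sample catalog and the `cloud://` locations are all hypothetical, and it assumes each stored genome has a small metadata record describing the subject.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SubjectRecord:
    """Hypothetical metadata kept alongside each stored genome."""
    subject_id: str
    gender: str
    age: int
    race: str
    genome_uri: str  # made-up location of the sequence file in cloud storage


def find_subjects(
    records: Iterable[SubjectRecord],
    gender: Optional[str] = None,
    race: Optional[str] = None,
    min_age: Optional[int] = None,
    max_age: Optional[int] = None,
) -> List[SubjectRecord]:
    """Return the records matching every filter that was supplied."""
    matches = []
    for record in records:
        if gender is not None and record.gender != gender:
            continue
        if race is not None and record.race != race:
            continue
        if min_age is not None and record.age < min_age:
            continue
        if max_age is not None and record.age > max_age:
            continue
        matches.append(record)
    return matches


# Invented sample data, for illustration only.
catalog = [
    SubjectRecord("S001", "female", 34, "asian", "cloud://example-bucket/S001.genome"),
    SubjectRecord("S002", "male", 61, "white", "cloud://example-bucket/S002.genome"),
    SubjectRecord("S003", "female", 58, "black", "cloud://example-bucket/S003.genome"),
]

older_women = find_subjects(catalog, gender="female", min_age=50)
print([record.subject_id for record in older_women])  # ['S003']
```

The point is that the search runs over the small metadata records, so only the genomes that match ever need to be opened.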
Crowdsourcing Our Genetics
Crowdsourcing has become a great tool for people who need to raise money, gather information, or in many cases, even make dramatic scientific discoveries. Stanford’s Folding@Home Project, for example, has been using personal computers around the globe for 16 years to find the answers to puzzles that have otherwise eluded scientists.
FoldIt, on the other hand, is a more interactive game that lets users actively work toward a solution rather than passively watching proteins fold. In 2011, users found the answer to a problem that had eluded scientists for 15 years. The amazing part is that, collectively, FoldIt players found the answer in three weeks.
If big data can do that for the pharmaceutical industry using just random Internet visitors, imagine what it could do with a crowd of industry professionals around the globe.
Risks vs. Rewards
Bringing big data into the pharmaceutical industry has the potential for great rewards. Unfortunately, whenever you bring the Internet into the data equation, there’s always some risk as well.
The first risk is that the data you find turns out to be junk with no real use or application. Even if you severely limit the number of people who can access your data cloud, there’s still a chance that someone will upload useless data that could skew your results.
The second risk, and arguably the most dangerous, is data privacy. All it takes is one person with nefarious intentions gaining access to your data cloud, and all of your patients’ information could be at risk. Hackers are targeting medical information now more than ever because it’s more valuable than credit card information and is not monitored nearly as closely.
It is possible to reduce this risk by removing personal information from the data and keeping only the details needed to classify it. Removing names, insurance information and other personal identifiers can help to protect your subjects while still allowing you to take advantage of the raw study data.
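A minimal sketch of that kind of clean-up is shown below. The field names, the `deidentify` and `pseudonym` helpers, the secret key and the sample record are all assumptions made for illustration, and a real de-identification process would need to cover many more identifiers than this. The sketch drops the direct identifiers and replaces the subject ID with a keyed hash, so records from the same subject can still be linked without exposing the original ID.

```python
import hashlib
import hmac

# Assumed field names for direct identifiers; a real schema will differ.
DIRECT_IDENTIFIERS = {"name", "insurance_id", "address", "phone"}


def pseudonym(subject_key: str, secret: bytes) -> str:
    """Derive a stable pseudonym from a subject ID using a keyed hash."""
    digest = hmac.new(secret, subject_key.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record: dict, secret: bytes) -> dict:
    """Drop direct identifiers and swap the subject ID for a pseudonym."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["subject_id"] = pseudonym(str(record["subject_id"]), secret)
    return cleaned


# Invented record and key, for illustration only.
SECRET = b"keep-this-key-outside-the-data-cloud"
raw = {
    "subject_id": "S001",
    "name": "Jane Example",
    "insurance_id": "INS-0000",
    "age": 34,
    "gender": "female",
}

clean = deidentify(raw, SECRET)
print(sorted(clean))  # ['age', 'gender', 'subject_id']
```

The key used for the pseudonyms has to be stored separately from the shared data; anyone who holds both can link the pseudonyms back to the original IDs.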
What It Means for Big Data and Pharmaceuticals
Overall, the use of big data in the pharmaceutical industry is going to be a force for great good and the launching point for many ingenious advances in the industry. There’s no telling what amazing things will fall from that cloud next!