Sunday, 24 May 2015

Big Data: Big Deal

In a world run by computers, where almost everything happens on the Internet, a giant, hidden and precious resource is being generated every moment, and every one of us is contributing to it. We know this resource as information, or data. Big Data is one of the hot new terms in the jargon of Internet and computer science literature.

What is Big Data?
In the field of computer science, every piece of information or media constitutes data. However, for data to be called Big Data, it needs to satisfy the soft criteria known as the 3 Vs: Volume, Velocity and Variety.
1.     Volume: As the name Big Data suggests, the data should be very large, generally of the order of petabytes.
2.     Velocity: The data is generated at a very high rate, generally of the order of gigabytes per second.
3.     Variety: Big Data generally consists of a wide variety of data, most of it unstructured.
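To make the three criteria above a little more concrete, here is a minimal Python sketch that checks a dataset against them. The names (DatasetProfile, three_vs, looks_like_big_data) are hypothetical and exist only for this illustration. The volume and velocity thresholds follow the orders of magnitude mentioned above, while the 50% cut-off for "mostly unstructured" is purely my assumption. Since the 3 Vs are soft criteria, treat the result as a rough hint and not a strict definition.

```python
from dataclasses import dataclass

PETABYTE = 10**15  # bytes
GIGABYTE = 10**9   # bytes


@dataclass
class DatasetProfile:
    """Hypothetical summary of a dataset, used only for this illustration."""
    total_bytes: int              # how much data is stored in total
    bytes_per_second: float       # how fast new data arrives
    unstructured_fraction: float  # share of unstructured data, 0.0 to 1.0


def three_vs(profile, min_volume=PETABYTE, min_velocity=GIGABYTE,
             min_unstructured=0.5):
    """Check each V separately.

    The default thresholds are assumptions based on the rough orders of
    magnitude in the text (petabytes, gigabytes per second). The 0.5
    cut-off for variety is an arbitrary illustrative choice.
    """
    return {
        "volume": profile.total_bytes >= min_volume,
        "velocity": profile.bytes_per_second >= min_velocity,
        "variety": profile.unstructured_fraction >= min_unstructured,
    }


def looks_like_big_data(profile):
    """True only when all three soft criteria are met."""
    return all(three_vs(profile).values())


# Example: a service storing 3 PB, receiving 2 GB every second,
# with 80% of its data unstructured (posts, images, logs).
social_network = DatasetProfile(3 * PETABYTE, 2 * GIGABYTE, 0.8)
print(three_vs(social_network))
print(looks_like_big_data(social_network))
```

Returning one flag per criterion, instead of a single yes or no, makes it easy to see which of the 3 Vs a dataset falls short on.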

Who is generating Big Data?
We are. To understand how we generate big data, we need to know what actually constitutes it. In most cases, big data is user profile, user preference and user activity data, where a user is anyone using a particular service, on the web or off it. Billions of people generate a plethora of information every second through their interactions with the services they use. With the advent of the Internet of Things (IoT), a world where the vast majority of gadgets, machines and humans are connected to the Internet, big data promises a future in which far more decisions are driven by data.

Why all the fuss?
Over the past few years, data has emerged not only as a resource but also as a precious commodity. We can only guess how precious it is, but I would not be wrong to say that it rivals the big commodities in the market, like oil and gold, and has the potential to beat all of them (combined) in terms of gross global value in the near future. Some people might think this is too bold a statement, but let me give some pointers to think about:
Where do you think ALMOST ALL OF THE REVENUE of tech giants like Google and Facebook comes from, when they are not charging the end user anything? Why do you think the government of India is so keen on investing in the UID scheme, smart cities etc., when such basic problems as illiteracy and discrimination remain far from solved? In fact, why do you think most of the services on the Internet are free for the end user?

The Big Data revolution: data never lies
Like every other precious commodity with such wide impact, Big Data has the potential to transform the world. If we look closely, many of our decisions and much of our behavior are already governed by data.
As the popular saying goes, “Data never lies”. Data is already being used by policy makers in progressive governments and by big organizations to implement changes, attract people, transform behavior and eliminate competition. It would not be an exaggeration to say that intelligent analysis of data is the key to successful administration and to a cutting-edge business strategy in this age of information. In other words, we are going through a Big Data revolution, where data is one of the primary drivers of change, both positive (as we have already seen) and negative (as we will see in the next section).

Well, there is a darker side too…
As some curious readers will have already guessed, like any other commodity with such a huge impact on people’s lives, Big Data comes with a cost and a set of challenges that cannot be ignored.
1.    Greed vs Privacy: Since big data is a huge source of revenue, it is very tempting to cross the boundaries of user privacy and use people's personal data to fill pockets. Precious as this commodity is, it can not only be used by the large corporations that already hold huge stores of big data, but can also be sold at insane prices to malicious clients.

2.    Data Colonialism: As happens with any commodity of such impact, the people with power over it don’t want to let it go. Large corporations like Google and Facebook already have a soft monopoly over Big Data, and it remains to be seen whether they will use it (or are already using it) not only to generate revenue but also to crush competition. However, the scenario here is not as bad as it used to be with oil, since the sources of big data are not limited (at least as long as net neutrality is maintained, which is itself a big subject of debate nowadays). The more worrisome phenomenon is the colonization of the analog universe by the digital. The term Data Colonialism was coined by Sorabji in 2013 to describe a scenario where the West has been mining African nations for health data without Africans benefiting in any way. This mirrors the extraction of raw materials from colonies in the 18th century: an extraction of value.

3.    Transforming behavior: We have already entered an era where advertisements are powered by artificial intelligence, which uses past user behavior to show the advertisements most likely to sway a user towards a certain product, person, organization, campaign etc. With the power that Big Data gives large corporations and governments, it is potentially a powerful tool for modifying human behavior on a large scale for their own benefit.

The problem of privacy violations can be solved to a large extent by imposing regulations on user privacy and auditing whether those regulations are followed. However, this is a big challenge, since data often crosses national boundaries and there are technological limits to enforcing such laws.
Data colonialism is a very real possibility, but not much can be done beyond supporting competing businesses to maintain an environment of open competition, so that such situations do not arise. The use of data to transform human behavior for personal gain seems ethically wrong, but the law cannot easily counteract the practice, especially when it is done with user consent. All we can do is make people aware of the potential risks and let them decide their own course of action.

In short…

Big Data is a very precious commodity and a powerful tool that can drive positive change and lead to a world that runs on intelligent decisions instead of the whims and fancies of people. However, this power comes with its own set of risks, which we, as the primary producers of this resource, need to be aware of. We must also stay vigilant about how this data is being used.
