Posts

AI Speeds Up Our Rendezvous With Complexity

Intelligent systems are augmenting our ability to make decisions. That ability rests on machine learning and deep learning models: non-parametric models that fit non-linear functions over large amounts of data. These models are hard to comprehend and explain, yet we are plugging them into business processes in every industry. Soon it will be hard to monitor and justify the root of a decision made by these models. Take, for example, a step in a process where a human decides. We can ask the human for the reason behind the decision, and the human exercises discretion grounded in his or her own ethics. The decision might not be consistent, or efficient at maximizing any downstream benefit, but at least somebody can explain its details and understand the reasoning behind it. Now let's take the data behind past human choices and create a model using AI in an intelligent system. We opt…

AI's Part In Our Evolution

For the longest time, our ancestors have transferred knowledge through social learning. Social learning in a family or community setting builds the foundation for adaptability as an adult; an extended childhood with ample opportunities for social learning is one of our key adaptations ( https://www.americanscientist.org/article/the-benefits-of-a-long-childhood ). All humans learn socially and individually, and we use whatever we learn to guide our behavior and decisions toward what we think will benefit us. The context of social learning is an ever-changing landscape, and the setting of our social learning, along with the people who influence us, is now shifting online. With a few clicks of a button, you can learn from professors at Ivy League universities, consult with a doctor, or even cook dinner alongside a world-class chef. Gone are the days when you learned your trade from your father or immediate family. You are no longer bound by the tradition of your ancestry…

Ethics and Fairness in Artificial Intelligence

We are at a time when leveraging Artificial Intelligence has become commonplace. Every business organization is now looking into its data and building intelligent agents that predict, recommend, and even decide on organizational transactions. Automated intelligent systems are under scrutiny because of biases recently uncovered in systems rolled out by some of the giants in tech. It has been proposed that such intelligent agents be regulated and be developed and implemented ethically, with equality, non-discrimination, accountability, and safety suggested as guiding principles. Regulating automated intelligent systems will have a huge impact on every industry.

Need To Be Accountable

Most intelligent systems are an ensemble of machine learning algorithms or deep learning neural networks, which fundamentally fit a function over past data. Some are derived from a mathematical formula developed through simulation or from past data. Ultimately…
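A minimal sketch of what "fitting a function over past data" means in practice, using scikit-learn. The dataset, features, and model choice here are illustrative assumptions, not any specific system the post describes:

```python
# Sketch: an "intelligent system" is, at its core, a function fitted
# over past data. Data and model here are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Past decisions: feature vectors (e.g., applicant attributes) and the
# human decisions recorded for them (1 = approve, 0 = deny).
X_past = rng.normal(size=(1000, 5))
y_past = (X_past[:, 0] + 0.5 * X_past[:, 1] > 0).astype(int)

# Fit a non-parametric, non-linear function over that history.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_past, y_past)

# The fitted function now decides on new cases, but unlike the human
# it replaces, it cannot articulate the reasoning behind any single
# prediction on its own.
new_case = rng.normal(size=(1, 5))
print(model.predict(new_case))
```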

Predicting Helpful Posts

Original paper: https://www.aclweb.org/anthology/N19-1318 Here is a quick summary: The research aims to identify helpful posts in forum discussion threads, especially long-running discussions. The approach models the relevance of each post with respect to the original post, and the novelty of a post (information not present in earlier posts of the thread) based on a windowed context. To model relevance, the original post and the target post are each encoded using an RNN (GRU), and the encoded sequences are then element-wise multiplied. To model novelty, the target post and the past K posts are encoded with the same RNN text encoder, where K is the number of past posts taken into context; a K between 7 and 11 worked best for the Reddit dataset used in the experiments, with performance plateauing once enough posts are in context. Once the K posts are encoded, they are fed through another RNN…
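A minimal sketch of the relevance signal as I read it: encode the original post and a target post with a shared GRU, then combine the two encodings element-wise. The embedding size, hidden size, vocabulary, and toy inputs are my assumptions, not the authors' exact setup:

```python
# Sketch of the relevance modeling step, assuming PyTorch.
import torch
import torch.nn as nn

EMBED_DIM, HIDDEN_DIM, VOCAB = 100, 128, 10000

embed = nn.Embedding(VOCAB, EMBED_DIM)
gru = nn.GRU(EMBED_DIM, HIDDEN_DIM, batch_first=True)  # shared text encoder

def encode(token_ids):
    """Encode a batch of token-id sequences into one vector per post."""
    _, h_n = gru(embed(token_ids))  # h_n: (1, batch, HIDDEN_DIM)
    return h_n.squeeze(0)           # (batch, HIDDEN_DIM)

# Toy batch: one original post and one candidate (target) post.
original = torch.randint(0, VOCAB, (1, 20))
target = torch.randint(0, VOCAB, (1, 15))

# Element-wise product of the two encodings captures how the target
# post relates to the original post.
relevance_features = encode(original) * encode(target)
print(relevance_features.shape)  # torch.Size([1, 128])
```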

Abusive Language Detection

Original paper - https://www.aclweb.org/anthology/N19-1221 Here is a quick summary: In a paper submitted by Facebook AI, London, to the recent NAACL (North American Chapter of the Association for Computational Linguistics) conference held in Minneapolis, the authors present a novel approach that uses Graph Convolutional Networks to outperform some of the best methods for detecting abusive language on the internet. The approach builds a heterogeneous graph containing an author community network and tweets; the graph is then used both to predict the class of a tweet and to generate an embedding. In the experiments, the researchers used embeddings from node2vec (sample implementation here: https://snap.stanford.edu/node2vec/) and a 2-layer Graph Convolutional Network. The Graph Convolutional Network representing the authors' profiles and tweets was used to classify each author's tweets into three classes using a softmax layer as the output layer of the network. To extract the embedding…
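A minimal sketch of a 2-layer Graph Convolutional Network with a 3-class softmax output, in plain PyTorch. The class count mirrors the summary above; the toy graph, feature sizes, and normalization details are my assumptions, not the paper's implementation:

```python
# Sketch of a Kipf & Welling-style 2-layer GCN over a toy graph.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_NODES, N_FEATS, N_HIDDEN, N_CLASSES = 50, 64, 16, 3

# Toy graph: random sparse adjacency, made symmetric, with self-loops.
A = (torch.rand(N_NODES, N_NODES) < 0.1).float()
A = ((A + A.t()) > 0).float()
A.fill_diagonal_(1.0)

# Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
d_inv_sqrt = A.sum(dim=1).pow(-0.5)
A_hat = d_inv_sqrt.unsqueeze(1) * A * d_inv_sqrt.unsqueeze(0)

class TwoLayerGCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.w1 = nn.Linear(N_FEATS, N_HIDDEN)
        self.w2 = nn.Linear(N_HIDDEN, N_CLASSES)

    def forward(self, x, a_hat):
        x = F.relu(self.w1(a_hat @ x))  # first graph convolution
        return self.w2(a_hat @ x)       # second convolution -> class logits

x = torch.randn(N_NODES, N_FEATS)       # node features (authors/tweets)
logits = TwoLayerGCN()(x, A_hat)
print(F.softmax(logits, dim=1).shape)   # (50, 3): per-node class probabilities
```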

Fake News and A New Life Philosophy

When I worked on Reuters Tracer for a little while, one of the hot topics at that time was detecting fake news. Fake news is a real issue for people who want reliable information and data, and those people are often decision makers who need to take action. I tried to dig into what "Fake News" is, which led me to a new life philosophy: a moral imperative to verify every piece of information we consume before we believe it, because beliefs are the basis of our actions. In this day and age, when everyone is streamed information all day through social media, adopting this philosophy becomes a necessity. Why would anyone want to create "Fake News"? The short answer: to win people over. "Fake News" mostly appeals to the target audience's emotions; you can't win people over with logic, facts, and data alone. Have you ever been riled up to work harder when presented with charts and graphs of our…

Graph Algorithms - Strongly Connected Components in Spark 2

Ever since I generated doc2vec (document embedding) vectors for our documents, we have found interesting things by doing computations and comparisons on them. For example, we find similar documents using cosine similarity and other similarity measures. These representations give us the flexibility to do a lot of things: we tried putting the vectors in an ANNOY index to find a document's near neighbors quickly. Now I am exploring these same vectors to find documents that are written repeatedly and discuss the same topic. I figured such documents would be closely similar, since documents that address the same topic will probably share a common vocabulary. What if I want to find the most influential document among these related documents? To do that, we need to define the connections between the documents, and when we say "connections," I can't help but think of a Graph (or Network). Another approach is to use…
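A minimal sketch of the pipeline this suggests, assuming GraphFrames on Spark 2 (the library the post's title points at): link documents whose cosine similarity clears a threshold, then run strongly connected components over the resulting graph. The toy vectors and the 0.95 threshold are my assumptions:

```python
# Sketch: similarity graph over document vectors + SCC via GraphFrames.
from pyspark.sql import SparkSession
from graphframes import GraphFrame
import numpy as np

spark = SparkSession.builder.appName("doc-scc").getOrCreate()

# Toy doc2vec-style vectors; in practice these come from the trained model.
docs = {"d1": [0.9, 0.1], "d2": [0.85, 0.2], "d3": [0.1, 0.95]}

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

vertices = spark.createDataFrame([(k,) for k in docs], ["id"])

# Connect documents whose similarity clears the (arbitrary) threshold.
pairs = [(s, t) for s in docs for t in docs
         if s != t and cosine(docs[s], docs[t]) > 0.95]
edges = spark.createDataFrame(pairs, ["src", "dst"])

g = GraphFrame(vertices, edges)
# Some GraphFrames algorithms need a checkpoint directory configured.
spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")
components = g.stronglyConnectedComponents(maxIter=10)
components.show()  # each document tagged with its component id
```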