
WHAT ARTIFICIAL INTELLIGENCE CANNOT DO , a grim note to the top 100 intellectuals of this planet , Part 6 - Capt Ajit Vadakayil



THIS POST IS CONTINUED FROM PART 5, BELOW--

https://ajitvadakayil.blogspot.com/2019/11/what-artificial-intelligence-cannot-do_4.html




Generative adversarial networks (GANs) are deep neural net architectures composed of two nets, pitting one against the other (thus the “adversarial”).




GANs’ potential is huge, because they can learn to mimic any distribution of data. That is, GANs can be taught to create worlds eerily similar to our own in any domain: images, music, speech, prose. They are robot artists in a sense, and their output is impressive.

In a surreal turn, Christie’s sold a portrait for $432,500 that had been generated by a GAN, based on open-source code written by Robbie Barrat of Stanford.

In 2019, DeepMind showed that variational autoencoders (VAEs) could outperform GANs on face generation.




Autoencoders encode input data as vectors. They create a hidden, or compressed, representation of the raw data. They are useful in dimensionality reduction; that is, the vector serving as a hidden representation compresses the raw data into a smaller number of salient dimensions. 

Autoencoders can be paired with a so-called decoder, which allows you to reconstruct input data based on its hidden representation, much as you would with a restricted Boltzmann machine.
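As a minimal illustration of this encoder/decoder pairing, here is a sketch in PyTorch; the layer sizes and the 784-dimensional input (a flattened 28x28 image) are assumptions made for the example, not taken from any particular system:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AutoEncoder(nn.Module):
    """Compress 784-dim inputs (e.g. flattened 28x28 images) to a 32-dim code and back."""
    def __init__(self, input_dim=784, code_dim=32):
        super().__init__()
        # Encoder: raw data -> hidden (compressed) representation
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                     nn.Linear(128, code_dim))
        # Decoder: reconstruct the input from its hidden representation
        self.decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim), nn.Sigmoid())

    def forward(self, x):
        code = self.encoder(x)        # dimensionality reduction
        return self.decoder(code)     # reconstruction

model = AutoEncoder()
x = torch.rand(64, 784)               # dummy batch standing in for real data
loss = F.mse_loss(model(x), x)        # reconstruction error to minimize during training
```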





To understand GANs, you should know how generative algorithms work, and for that, contrasting them with discriminative algorithms is instructive. Discriminative algorithms try to classify input data; that is, given the features of an instance of data, they predict a label or category to which that data belongs.

Meanwhile, the generator is creating new, synthetic images that it passes to the discriminator. It does so in the hopes that they, too, will be deemed authentic, even though they are fake. The goal of the generator is to generate passable hand-written digits: to lie without being caught. The goal of the discriminator is to identify images coming from the generator as fake.

Here are the steps a GAN takes:--- 
The generator takes in random numbers and returns an image.
This generated image is fed into the discriminator alongside a stream of images taken from the actual, ground-truth dataset.
The discriminator takes in both real and fake images and returns probabilities, a number between 0 and 1, with 1 representing a prediction of authenticity and 0 representing fake.

So you have a double feedback loop:--

The discriminator is in a feedback loop with the ground truth of the images, which we know.
The generator is in a feedback loop with the discriminator.

You can think of a GAN as the opposition of a counterfeiter and a cop in a game of cat and mouse, where the counterfeiter is learning to pass false notes, and the cop is learning to detect them. Both are dynamic; i.e. the cop is in training, too (to extend the analogy, maybe the central bank is flagging bills that slipped through), and each side comes to learn the other’s methods in a constant escalation.

One neural network, called the generator, generates new data instances, while the other, the discriminator, evaluates them for authenticity; i.e. the discriminator decides whether each instance of data that it reviews belongs to the actual training dataset or not.

When you train the discriminator, hold the generator values constant; and when you train the generator, hold the discriminator constant. Each should train against a static adversary. For example, this gives the generator a better read on the gradient it must learn by.
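A rough sketch of this alternating schedule is shown below; it is not taken from any particular implementation, and the tiny networks, loss function, and learning rates are illustrative placeholders:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())  # generator
D = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1))              # discriminator (logits)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):                            # real: (batch, 784) ground-truth samples
    batch = real.size(0)
    noise = torch.randn(batch, 16)

    # Train the discriminator while the generator is held constant (detach freezes G)
    fake = G(noise).detach()
    loss_D = bce(D(real), torch.ones(batch, 1)) + bce(D(fake), torch.zeros(batch, 1))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Train the generator while the discriminator is held constant (only opt_G steps)
    loss_G = bce(D(G(noise)), torch.ones(batch, 1))   # reward fakes the discriminator calls real
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()

train_step(torch.rand(32, 784))                   # one alternating update on a dummy batch
```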

Each side of the GAN can overpower the other. If the discriminator is too good, it will return values so close to 0 or 1 that the generator will struggle to read the gradient. If the generator is too good, it will persistently exploit weaknesses in the discriminator that lead to false negatives. This may be mitigated by the nets’ respective learning rates. The two neural networks must have a similar “skill level.”

GANs take a long time to train. On a single GPU a GAN might take hours, and on a single CPU more than a day.


Generative adversarial networks describe pairs of alternately trained models built from competing deep learning algorithms: the first model is trained against a second model to discriminate between actual data and synthetic data. This ability to capture and copy variations within a dataset can be applied to uses such as understanding risk and recovery in healthcare and pharmacology.

Adversarial machine learning is a technique employed in the field of machine learning which attempts to fool models through malicious input. This technique can be applied for a variety of reasons, the most common being to attack or cause a malfunction in standard machine learning models.



Machine learning techniques were originally designed for stationary and benign environments in which the training and test data are assumed to be generated from the same statistical distribution. 

However, when those models are implemented in the real world, the presence of intelligent and adaptive adversaries may violate that statistical assumption to some degree, depending on the adversary. 

In practice, a malicious adversary can surreptitiously manipulate input data so as to exploit specific vulnerabilities of learning algorithms and compromise the security of the machine learning system.

In the old days, the voice generated by computers did not sound human, and the creation of a voice model required hundreds of hours of coding and tweaking. Now, with the help of neural networks, synthesizing human voice has become less cumbersome.

The process involves using generative adversarial networks (GAN), an AI technique that pits neural networks against each other to create new data.

GANs are an exciting and rapidly changing field, delivering on the promise of generative models in their ability to generate realistic examples across a range of problem domains, most notably in image-to-image translation tasks such as translating photos of summer to winter or day to night, and in generating photorealistic photos of objects, scenes, and people that even humans cannot tell are fake.

First, a neural network ingests numerous samples of a person’s voice until it can tell whether a new voice sample belongs to the same person. Then, a second neural network generates audio data and runs it through the first one to see if it is validated as belonging to the subject. 

If it doesn’t, the generator corrects its sample and re-runs it through the classifier. The two networks repeat the process until they are able to generate samples that sound natural.

 For instance, companies are using AI-powered voice synthesis to enhance their customer experience and give their brand its own unique voice. In the field of medicine, AI is helping ALS patients to regain their true voice instead of using a computerized voice. 




AI speech synthesis also has its evil uses. Namely, it can be used for forgery, to place calls with the voice of a targeted person, or to spread fake news by imitating the voice of a head of state or high-profile politician.

GANs can generate realistic images by training a  'detective ANN' to recognise whether a picture was produced by a human or a computer, then  training a 'forger ANN' to produce images, which are tested by the detective ANN. 



In the process,  the pair of ANNs both get better, the detective at identifying fake images and the forger at  producing realistic images. The forger can be used to develop various image modification and production tools, for example to create aged versions of real faces, or to generate completely novel imagined faces.

The same principles have been applied to train ANNs to produce realistic sounds, such as voice impersonation, or videos, adding (imagined) movements to photographs.  Such techniques can be used to generate extremely realistic AI-generated videos known as  'deepfakes', which have been used to produce fake pornographic videos featuring celebrities, and  videos that appear to show politicians making statements

The problem with neural networks, however, is that the way they develop their pattern recognition behavior is very complex and opaque. And despite their name, neural networks work in ways that are very different from the human brain. That’s why they can be fooled in ways that will be unnoticed by humans.



Adversarial examples are input data manipulated in ways that will force a neural network to change its behavior while maintaining the same meaning to a human observer. For instance, in the case of an image classifier neural network, adding a special layer of noise to an image will cause the AI to assign a different classification to it.
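As a rough sketch of how such a “layer of noise” can be computed, the fast gradient sign method nudges each pixel in the direction that increases the classifier’s loss; the model, label, and epsilon below are placeholders, and pixel values are assumed to lie in [0, 1]:

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, image, label, epsilon=0.03):
    """Add a small 'layer of noise' that changes the classifier's decision while the
    image still looks unchanged to a human (fast gradient sign method)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Nudge every pixel slightly in the direction that increases the loss
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()          # assumes pixel values in [0, 1]
```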

During adversarial training, engineers expand neural networks. For each layer, they add more neurons to memorize the mistakes and make the AI model more robust.

Hackers have been tricking AI systems using what are termed “adversarial attacks”, where an extra layer (the adversary) is added onto the data, such as an extra layer of noise on an image. This can trick the AI’s algorithms into misclassifying the image, which can then let malicious code enter.

Companies like FedEx and Sprint are also using predictive analytics to pinpoint customers who are “flight risks” and may defect to a competitor.


Pre-empting criminals attempting to hijack artificial intelligence by tampering with datasets or the physical environment, researchers have turned to adversarial machine learning. This is where data has been tweaked to trick a neural network and fool systems into seeing something that isn't there, ignoring what is, or misclassifying objects entirely.

Currently, there is no concrete way to defend against adversarial machine learning; however, there are a few techniques which can help prevent an attack of this type from happening. Such techniques include adversarial training and defensive distillation.

Adversarial training is a process where adversarial examples are introduced to the model and labeled as threatening. This process can be useful in preventing further adversarial machine learning attacks from occurring, but requires large amounts of maintenance.
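One common variant augments each training batch with perturbed copies that keep their true labels, so the model learns to resist the perturbation. A minimal sketch under that assumption (the helper logic and epsilon are illustrative, not a description of any specific product):

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, images, labels, epsilon=0.03):
    """One training step on a batch augmented with adversarially perturbed copies."""
    # Craft perturbed copies with the fast gradient sign method (as in the earlier sketch)
    adv = images.clone().detach().requires_grad_(True)
    F.cross_entropy(model(adv), labels).backward()
    adv = (adv + epsilon * adv.grad.sign()).clamp(0, 1).detach()

    # Train on clean and adversarial inputs together; perturbed copies keep their true labels
    batch, targets = torch.cat([images, adv]), torch.cat([labels, labels])
    loss = F.cross_entropy(model(batch), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```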

Defensive distillation aims to make a machine learning algorithm more flexible by having one model predict the outputs of another model which was trained earlier.  This approach can identify unknown threats. 

It is similar in spirit to generative adversarial networks (GANs), which set up two neural networks together to speed up machine learning, in the sense that two machine learning models are used together. This happens very frequently: some of the most advanced spammer groups try to throw the Gmail filter off-track by reporting massive amounts of spam emails as not spam.

Adversarial data describes a situation in which human users intentionally supply an algorithm with corrupted information. The corrupted data throws off the machine learning process, tricking the algorithm into reaching fake conclusions or incorrect predictions.


UC Berkeley professor Dawn Song notably tricked a self-driving car into thinking that a stop sign says the speed limit is 45 miles per hour.

A malicious attack of this nature could easily result in a fatal accident. Similarly, compromised algorithms could lead to faulty biomedical research, endangering lives or delaying life-saving innovations.

Adversarial data has only recently begun to be recognized for the threat it is — and it can’t go overlooked any longer.

Interestingly, adversarial data output can occur even without malicious intent. This is largely because of the way algorithms can “see” things in the data that we humans are unable to discern. Because of that “visibility,” a recent case study from MIT describes adversarial examples as “features” rather than bugs.

In the study, researchers separated “robust” and “non-robust” characteristics during AI learning. Robust features are what humans typically perceive, while non-robust features are only detected by AI. An attempt at having an algorithm recognize pictures of cats revealed that the system was looking at real patterns present in the images to draw incorrect conclusions.

The misidentification occurred because the AI was looking at a set of pixels imperceptible to humans, which led it to improperly identify photos. This caused the system to be inadvertently trained to use misleading patterns in its identification algorithm.

These non-robust characteristics served as a type of interfering “noise” that led to flawed results from the algorithm. As a result, for hackers to interfere with AI, they often simply need to introduce a few non-robust characteristics — things that aren’t easily identified by human eyes, but that can dramatically change AI output.

Adversarial attacks could induce an algorithm to incorrectly label harmful or contaminated samples as clean and benign. This can lead to misguided research results or incorrect medical diagnoses.

Despite these issues, adversarial data can also be used for good. Indeed, many developers have begun using adversarial data to uncover system vulnerabilities on their own, allowing them to implement security upgrades before hackers can take advantage of the weakness. Developers are using machine learning to create AI systems that are more adept at identifying and eliminating potential digital threats.

At a high level, attacks against classifiers can be broken down into three types:

Adversarial inputs, which are specially crafted inputs that have been developed with the aim of being reliably misclassified in order to evade detection. Adversarial inputs include malicious documents designed to evade antivirus, and emails attempting to evade spam filters.

Data poisoning attacks, which involve feeding adversarial training data to the classifier. The most common attack type we observe is model skewing, where the attacker attempts to pollute training data in such a way that the boundary between what the classifier categorizes as good data and what it categorizes as bad shifts in the attacker’s favor. 

The second type of attack we observe in the wild is feedback weaponization, which attempts to abuse feedback mechanisms in an effort to manipulate the system toward misclassifying good content as abusive (e.g., competitor content or as part of revenge attacks).

Model stealing techniques, which are used to “steal” (i.e., duplicate) models or recover training data membership via blackbox probing. This can be used, for example, to steal stock market prediction models and spam filtering models, in order to use them or be able to optimize more efficiently against such models.

The creation of adversarial samples often involves first building a ‘mask’ that can be applied to an existing input, such that it tricks the model into producing the wrong output. In the case of adversarially created image inputs, the images themselves appear unchanged to the human eye.

Adversarial samples created in this way can even be used to fool a classifier when the image is printed out, and a photo is taken of the printout. Even simpler methods have been found to create adversarial images. 

The ease with which, and the number of ways in which, adversarial samples designed to fool image recognition models can be created illustrates that, should these models be used to make important decisions (such as in content filtering systems), mitigations (described in the fourth article in this series) should be carefully considered before production deployment.

An attacker submits adversarially altered pornographic ad banners to a popular, well-reputed ad provider service. The submitted images bypass their machine learning-based content filtering system. 

The pornographic ad banner is displayed on frequently visited high-profile websites. As a result, minors are exposed to images that would usually have been blocked by parental control software. This is an availability attack.

Researchers have recently demonstrated that adversarial samples can be crafted for areas other than image classification.

 In August 2018, a group of researchers at the Horst Görtz Institute for IT Security in Bochum, Germany, crafted psychoacoustic attacks against speech recognition systems, allowing them to hide voice commands in audio of birds chirping. The hidden commands were not perceivable to the human ear, so the audio tracks were perceived differently by humans and machine-learning-based systems.

An attacker embeds hidden voice commands into video content, uploads it to a popular video sharing service, and artificially promotes the video (using a Sybil attack).


A Sybil attack is a type of attack seen in peer-to-peer networks, in which a node in the network operates multiple identities actively at the same time and undermines the authority or power in reputation systems. In a Sybil attack, the attacker subverts the reputation system of a peer-to-peer network by creating a large number of pseudonymous identities and uses them to gain a disproportionately large influence.


When the video is played on the victim’s system, the hidden voice commands successfully instruct a digital home assistant device to purchase a product without the owner knowing, instruct smart home appliances to alter settings (e.g. turn up the heat, turn off the lights, or unlock the front door), or instruct a nearby computing device to perform searches for incriminating content (such as drugs or child pornography) without the owner’s knowledge, allowing the attacker to subsequently blackmail the victim. This is an availability attack.


An attacker forges a ‘leaked’ phone call depicting plausible scandalous interaction involving high-ranking politicians and business people. The forged audio contains embedded hidden voice commands. 

The message is broadcast during the evening news on national and international TV channels. The attacker gains the ability to issue voice commands to home assistants or other voice recognition control systems (such as Siri) on a potentially massive scale. This is an availability attack.


Availability attacks against natural language processing systems--


Natural language processing (NLP) models are used to parse and understand human language. Common uses of NLP include sentiment analysis, text summarization, question/answer systems, and the suggestions you might be familiar with in web search services.    

It is a piece of cake to use adversarial samples to fool natural language processing models by replacing words with synonyms  to bypass spam filtering, change the outcome of sentiment analysis, and fool a fake news detection model. 
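As a toy illustration only (the “classifier” here is just a keyword filter invented for the example, not a real spam model), synonym substitution can be enough to flip a naive verdict:

```python
# Toy keyword 'spam filter' and a synonym-substitution evasion, both invented for illustration.
TRIGGERS = ("free", "winner", "cash")
SYNONYMS = {"free": "complimentary", "winner": "selected customer", "cash": "funds"}

def is_spam(text):                       # stand-in for a trained spam classifier
    return any(word in text.lower() for word in TRIGGERS)

def evade(text):                         # replace trigger words with harmless-looking synonyms
    for word, alt in SYNONYMS.items():
        text = text.replace(word, alt)
    return text

message = "You are a winner! Claim your free cash now."
print(is_spam(message), is_spam(evade(message)))   # True False
```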

English is a stupid language..

http://ajitvadakayil.blogspot.com/2010/11/sanskrit-digital-language-versus-versus.html


Scenario: evade fake news detection systems to alter political discourse

Fake news detection is a relatively difficult problem to solve with automation, and hence, fake news detection solutions are still in their infancy. As these techniques improve and people start to rely on verdicts from trusted fake news detection services, tricking such services infrequently, and at strategic moments would be an ideal way to inject false narratives into political or social discourse. In such a scenario, an attacker would create a fictional news article based on current events, and adversarially alter it to evade known respected fake news detection systems. The article would then find its way into social media, where it would likely spread virally before it can be manually fact-checked. This is an availability attack.

Scenario: trick automated trading algorithms that rely on sentiment analysis
Over an extended period of time, an attacker publishes and promotes a series of adversarially created social media messages designed to trick sentiment analysis classifiers used by automated trading algorithms. One or more high-profile trading algorithms trade incorrectly over the course of the attack, leading to losses for the parties involved, and a possible downturn in the market. This is an availability attack.


Availability attacks – reinforcement learning

Reinforcement learning is the process of training an agent to perform actions in an environment. Reinforcement learning models are commonly used by recommendation systems, self-driving vehicles, robotics, and games. Reinforcement learning models receive the current environment’s state (e.g. a screenshot of the game) as an input, and output an action (e.g. move joystick left). It is a piece of cake to use adversarial attacks to trick reinforcement learning models into performing incorrect actions. 


Two distinct types of attacks can be performed against reinforcement learning models.

A strategically timed attack modifies a single or small number of input states at a key moment, causing the agent to malfunction. For instance, in the game of pong, if a strategic attack is performed as the ball approaches the agent’s paddle, the agent will move its paddle in the wrong direction and miss the ball.

An enchanting attack modifies a number of input states in an attempt to “lure” the agent away from a goal. For instance, an enchanting attack against an agent playing Super Mario could lure the agent into running on the spot, or moving backwards instead of forwards.

Scenario: hijack autonomous military drones
By use of an adversarial attack against a reinforcement learning model, autonomous military drones are coerced into attacking a series of unintended targets, causing destruction of property, loss of life, and the escalation of a military conflict. This is an availability attack.

Scenario: hijack an autonomous delivery drone
By use of a strategically timed policy attack, an attacker fools an autonomous delivery drone to alter course and fly into traffic, fly through the window of a building, or land (such that the attacker can steal its cargo, and perhaps the drone itself). This is an availability attack.

The processes used to craft attacks against classifiers, NLP systems, and reinforcement learning agents are similar. As of writing, all attacks crafted in these domains have been purely academic in nature, and we have not read about or heard of any such attacks being used in the real world. 

However, tooling around these types of attacks is getting better, and easier to use. During the last few years, machine learning robustness toolkits have appeared on github. These toolkits are designed for developers to test their machine learning implementations against a variety of common adversarial attack techniques. 

The Adversarial Robustness Toolbox, developed by IBM, contains implementations of a wide variety of common evasion attacks and defence methods, and is freely available on github. Cleverhans, a tool developed by Ian Goodfellow and Nicolas Papernot, is a Python library to benchmark machine learning systems’ vulnerability to adversarial examples. It is also freely available on github.


Replication attacks: transferability attacks
Transferability attacks are used to create a copy of a machine learning model (a substitute model), thus allowing an attacker to “steal” the victim’s intellectual property, or craft attacks against the substitute model that work against the original model. Transferability attacks are straightforward to carry out, assuming the attacker has unlimited ability to query a target model.

In order to perform a transferability attack, a set of inputs are crafted, and fed into a target model. The model’s outputs are then recorded, and that combination of inputs and outputs are used to train a new model. It is worth noting that this attack will work, within reason, even if the substitute model is not of absolutely identical architecture to the target model.
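A minimal sketch of that procedure, assuming the attacker can call the target model as a black-box function; the `target_predict` name, query budget, and substitute architecture are illustrative assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def steal_model(target_predict, n_queries=5000, n_features=20):
    """Train a substitute model purely from black-box queries against `target_predict`."""
    queries = np.random.uniform(-1, 1, size=(n_queries, n_features))   # crafted inputs
    answers = target_predict(queries)                                  # record the target's outputs
    substitute = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    substitute.fit(queries, answers)        # substitute learns to mimic the decision boundary
    return substitute
```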


It is possible to create a ‘self-learning’ attack to efficiently map the decision boundaries of a target model with relatively few queries. This works by using a machine learning model to craft samples that are fed as input to the target model. The target model’s outputs are then used to guide the training of the sample crafting model. As the process continues, the sample crafting model learns to generate samples that more accurately map the target model’s decision boundaries.


Confidentiality attacks: inference attacks
Inference attacks are designed to determine the data used during the training of a model. Some machine learning models are trained against confidential data such as medical records, purchasing history, or computer usage history. An adversary’s motive for performing an inference attack might be out of curiosity – to simply study the types of samples that were used to train a model – or malicious intent – to gather confidential data, for instance, for blackmail purposes.

A black box inference attack follows a two-stage process. The first stage is similar to the transferability attacks described earlier. The target model is iteratively queried with crafted input data, and all outputs are recorded. This recorded input/output data is then used to train a set of binary classifier ‘shadow’ models – one for each possible output class the target model can produce. For instance, an inference attack against an image classifier that can identify ten different types of images (cat, dog, bird, car, etc.) would create ten shadow models – one for cat, one for dog, one for bird, and so on. All inputs that resulted in the target model outputting “cat” would be used to train the “cat” shadow model, and all inputs that resulted in the target model outputting “dog” would be used to train the “dog” shadow model, etc.

The second stage uses the shadow models trained in the first step to create the final inference model. Each separate shadow model is fed a set of inputs consisting of a 50-50 mixture of samples that are known to trigger positive and negative outputs. The outputs produced by each shadow model are recorded. For instance, for the “cat” shadow model, half of the samples in this set would be inputs that the original target model classified as “cat”, and the other half would be a selection of inputs that the original target model did not classify as “cat”. 

All inputs and outputs from this process, across all shadow models, are then used to train a binary classifier that can identify whether a sample it is shown was “in” the original training set or “out” of it. So, for instance, the data we recorded while feeding the “cat” shadow model different inputs, would consist of inputs known to produce a “cat” verdict with the label “in”, and inputs known not to produce a “cat” verdict with the label “out”. A similar process is repeated for the “dog” shadow model, and so on. All of these inputs and outputs are used to train a single classifier that can determine whether an input was part of the original training set (“in”) or not (“out”).
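A compressed sketch of the second stage for a single shadow model, assuming scikit-learn; the helper names and the use of output probability vectors as attack features are assumptions made for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_attack_model(shadow_model, in_samples, out_samples):
    """Given one shadow model plus samples known to be in/out of its training set,
    learn to tell members from non-members using the model's output scores."""
    in_scores = shadow_model.predict_proba(in_samples)    # members of the shadow training set
    out_scores = shadow_model.predict_proba(out_samples)  # samples it was not trained on
    X = np.vstack([in_scores, out_scores])
    y = np.concatenate([np.ones(len(in_scores)), np.zeros(len(out_scores))])  # 1 = "in", 0 = "out"
    attack = LogisticRegression(max_iter=1000).fit(X, y)
    return attack   # attack.predict(target.predict_proba(x)) then guesses membership of x
```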

This black box inference technique works very well against models generated by online machine-learning-as-a-service offerings, such as those available from Google and Amazon. Machine learning experts are in short supply and high demand. Many companies are unable to attract machine learning experts to their organizations, and many are unwilling to fund in-house teams with these skills. Such companies will turn to simple turnkey machine-learning-as-a-service solutions for their needs, likely without the knowledge that these systems are vulnerable to such attacks.


Poisoning attacks against anomaly detection systems

Anomaly detection algorithms are employed in areas such as credit card fraud prevention, network intrusion detection, spam filtering, medical diagnostics, and fault detection. Anomaly detection algorithms flag anomalies when they encounter data points occurring far enough away from the ‘centers of mass’ of clusters of points seen so far. These systems are retrained with newly collected data on a periodic basis. As time goes by, it can become too expensive to train models against all historical data, so a sliding window (based on sample count or date) may be used to select new training data.
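A minimal sketch of such a detector with a sliding training window is shown below; the window size, distance threshold, and synthetic data are illustrative assumptions:

```python
import numpy as np

class CentroidAnomalyDetector:
    """Flags points that fall too far from the centre of mass of recently seen data,
    retraining on a sliding window of the newest samples."""
    def __init__(self, window=10_000, threshold=3.0):
        self.window, self.threshold, self.history = window, threshold, []

    def fit(self, X):
        self.history = (self.history + list(X))[-self.window:]   # keep only the sliding window
        data = np.asarray(self.history)
        self.center = data.mean(axis=0)
        self.scale = data.std(axis=0).mean() + 1e-9

    def is_anomaly(self, x):
        return np.linalg.norm(np.asarray(x) - self.center) / self.scale > self.threshold

det = CentroidAnomalyDetector()
det.fit(np.random.normal(0, 1, size=(1000, 4)))       # 'normal' traffic
print(det.is_anomaly([0.1, -0.2, 0.0, 0.3]), det.is_anomaly([25, 25, 25, 25]))   # False True
```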



Attacks against recommenders
Recommender systems are widely deployed by web services (e.g., YouTube, Amazon, and Google News) to recommend relevant items to users, such as products, videos, and news. Some examples of recommender systems include:----

YouTube recommendations that pop up after you watch a video
Amazon “people who bought this also bought…”
Twitter “you might also want to follow” recommendations that pop up when you engage with a tweet, perform a search, follow an account, etc.
Social media curated timelines
Netflix movie recommendations

App store purchase recommendations


Recommenders are implemented in various ways:---

Recommendation based on user similarity
This technique finds the users most similar to a target user, based on items they’ve interacted with. It then predicts the target user’s rating scores for other items based on the rating scores of those similar users. For instance, if user A and user B both interacted with item 1, and user B also interacted with item 2, recommend item 2 to user A.

Recommendation based on item similarity
This technique finds common interactions between items and then recommends items to a target user based on those interactions. For instance, if many users have interacted with both items A and B, then when a target user interacts with item A, recommend item B.

Recommendation based on both user and item similarity
These techniques use a combination of both user and item similarity-matching logic. This can be done in a variety of ways. For instance, rankings for items a target user has not interacted with yet are predicted via a ranking matrix generated from interactions between users and items that the target already interacted with.

An underlying mechanism in many recommendation systems is the co-visitation graph. It consists of a set of nodes and edges, where nodes represent items (products, videos, users, posts) and edge weights represent the number of times a combination of items were visited by the same user.
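A minimal sketch of building such a co-visitation graph from per-user item lists; the session data below is invented for illustration:

```python
from collections import defaultdict
from itertools import combinations

def build_covisitation_graph(user_histories):
    """user_histories: one list of item ids per user.
    Edge weight = number of users who visited both items."""
    graph = defaultdict(int)
    for items in user_histories:
        for a, b in combinations(sorted(set(items)), 2):
            graph[(a, b)] += 1
    return dict(graph)

histories = [["video1", "video2"], ["video1", "video2", "video3"], ["video2", "video3"]]
print(build_covisitation_graph(histories))
# {('video1', 'video2'): 2, ('video1', 'video3'): 1, ('video2', 'video3'): 2}
```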
  
 The most widely used attacks against recommender systems are Sybil attacks (which are integrity attacks, see above). The attack process is simple – an adversary creates several fake users or accounts, and has them engage with items in patterns designed to change how that item is recommended to other users. Here, the term ‘engage’ is dependent on the system being attacked, and could include rating an item, reviewing a product, browsing a number of items, following a user, or liking a post. 

Attackers may probe the system using ‘throw-away’ accounts in order to determine underlying mechanisms, and to test detection capabilities. Once an understanding of the system’s underlying mechanisms has been acquired, the attacker can leverage that knowledge to perform efficient attacks on the system (for instance, based on knowledge of whether the system is using co-visitation graphs). Skilled attackers carefully automate their fake users to behave like normal users in order to avoid Sybil attack detection techniques.


Motives include:--

promotion attacks – trick a recommender system into promoting a product, piece of content, or user to as many people as possible

demotion attacks – cause a product, piece of content, or user to not be promoted as much as it should
social engineering – in theory, if an adversary already has knowledge on how a specific user has interacted with items in the system, an attack can be crafted to target that user with a recommendation such as a YouTube video, malicious app, or imposter account to follow.

Numerous attacks are already being performed against recommenders, search engines, and other similar online services. In fact, an entire industry exists to support these attacks. With a simple web search, it is possible to find inexpensive purchasable services to poison app store ratings, restaurant rating systems, and comments sections on websites and YouTube, inflate online polls, and engagement (and thus visibility) of content or accounts, and manipulate autocomplete and search results.

The prevalence and cost of such services indicates that they are probably widely used. Maintainers of social networks, e-commerce sites, crowd-sourced review sites, and search engines must be able to deal with the existence of these malicious services on a daily basis. Detecting attacks on this scale is non-trivial and takes more than rules, filters, and algorithms. Even though plenty of manual human labour goes into detecting and stopping these attacks, many of them go unnoticed.

From celebrities inflating their social media profiles by purchasing followers, to Cambridge Analytica’s reported involvement in meddling with several international elections, to a non-existent restaurant becoming London’s number one rated eatery on TripAdvisor, to coordinated review brigading ensuring that conspiratorial literature about vaccinations and cancer were highly recommended on Amazon , to a plethora of psy-ops attacks launched by the alt-right, high profile examples of attacks on social networks are becoming more prevalent, interesting, and perhaps disturbing. 

These attacks are often discovered long after the fact, when the damage is already done. Identifying even simple attacks while they are ongoing is extremely difficult, and there is no doubt many attacks are ongoing at this very moment.


Attacks against federated learning systems--
Federated learning is a machine learning setting where the goal is to train a high-quality centralized model based on models locally trained in a potentially large number of clients, thus, avoiding the need to transmit client data to the central location. A common application of federated learning is text prediction in mobile phones. 

Each phone contains a local machine learning model that learns from its user (for instance, which recommended word they clicked on). The phone transmits its learning (the phone’s model’s weights) to an aggregator system, and periodically receives a new model trained on the learning from all of the other phones participating.
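A rough sketch of the aggregator step, in the spirit of federated averaging; weighting each client by the amount of data it trained on is an assumption of this example, not a description of any particular deployment:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Aggregator step: combine locally trained weights from many clients,
    weighting each client by how much data it trained on (FedAvg-style)."""
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
            for i in range(n_layers)]

# Two 'phones', each reporting one weight matrix and one bias vector:
phone_a = [np.ones((4, 2)), np.zeros(2)]
phone_b = [np.zeros((4, 2)), np.ones(2)]
new_global_model = federated_average([phone_a, phone_b], client_sizes=[300, 100])
```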



Attacks against federated learning can be viewed as poisoning or supply chain (integrity) attacks. A number of Sybils, designed to poison the main model, are inserted into a federated learning network. These Sybils collude to transmit incorrectly trained model weights back to the aggregator which, in turn, pushes poisoned models back to the rest of the participants. 

For a federated text prediction system, a number of Sybils could be used to perform an attack that causes all participants’ phones to suggest incorrect words in certain situations. The ultimate solution to preventing attacks in federated learning environments is to find a concrete method of establishing and maintaining trust amongst the participants of the network, which is clearly very challenging.


The understanding of flaws and vulnerabilities inherent in the design and implementation of systems built on machine learning and the means to validate those systems and to mitigate attacks against them are still in their infancy, complicated – in comparison with traditional systems –  by the lack of explainability to the user, heavy dependence on training data, and oftentimes frequent model updating. This field is attracting the attention of researchers, and is likely to grow in the coming years. As understanding in this area improves, so too will the availability and ease-of-use of tools and services designed for attacking these systems.

As artificial-intelligence-powered systems become more prevalent, it is natural to assume that adversaries will learn how to attack them. Indeed, some machine-learning-based systems in the real world have been under attack for years already. As we witness today in conventional cyber security, complex attack methodologies and tools initially developed by highly resourced threat actors, such as nation states, eventually fall into the hands of criminal organizations and then common cyber criminals. 

This same trend can be expected for attacks developed against machine learning models.

Data Poisoning: Owing to the large volume of structured and unstructured data, BFSI companies are a prime target for cyber crooks to perpetrate data attacks. As deployment of AI-enabled models in financial services sees an uptick, there is a risk of hackers manipulating the data used to train these models. Known as data poisoning, such an attack results in the generation of erroneous output.



Adversarial AI: As organizations deploy intelligent systems, there are untrusted infrastructure tools, such as open-source data analytics and ML frameworks, which can be compromised by criminals to extract data. Hackers use adversarial machine learning to detect patterns and identify vulnerabilities and fraud controls in the network. This enables them to plant malware that sits undetected on the network and slowly exfiltrates confidential data passing through the system.

AI systems are susceptible to adversarial attacks, so input sanitization should be on the security agenda of BFSI companies. IT systems should be trained to identify potential adversarial attacks by exposing them to weaker versions of such attacks, such as distorted images. To prevent data leakage, security infrastructure should exhaustively cover all network endpoints. 

Humans are often the weakest link in security; business leaders should take conscious steps such as regular training and awareness initiatives to develop a common understanding of the company’s security procedures among employees. It is important to note that security is a matter of system design, so security features should be baked into the design stages of an AI-based application and updated over time to tackle the expanding threat landscape.

Trojans in Artificial Intelligence (TrojAI)
The U.S. Army Research Office, in partnership with the Intelligence Advanced Research Projects Activity, seeks research and development of technology and techniques for detection of Trojans in Artificial Intelligence. TrojAI is envisioned to be a two-year effort with multiple awardees coming together as a group of performers who will work together to achieve the common program goals set forth in the Broad Agency Announcement.



Current State
Using current machine learning methods, an Artificial Intelligence is trained on data, learns relationships in that data, and then is deployed to the world to operate on new data. For example, an AI can be trained on images of traffic signs, learn what stop signs and speed limit signs look like, and then be deployed as part of an autonomous car. The problem is that an adversary that can disrupt the training pipeline can insert Trojan behaviors into the AI. 

For example, an AI learning to distinguish traffic signs can be given potentially just a few additional examples of stop signs with yellow squares on them, each labeled “speed limit sign”. If the AI were deployed in a self-driving car, an adversary could cause the car to run through the stop sign just by putting a sticky note on it, since the AI would incorrectly see it as a speed limit sign. The goal of the TrojAI Program is to combat such Trojan attacks by inspecting AIs for Trojans.

Defending Against Trojan Attacks
Trojan attacks, also called backdoor or trapdoor attacks, involve modifying an AI to attend to a specific trigger in its inputs, which if present will cause the AI to give a specific incorrect response. In the traffic sign case, the trigger is a sticky note. For a Trojan attack to be effective the trigger must be rare in the normal operating environment, so that the Trojan does not activate on test data sets or in normal operations, either one of which could raise the suspicions of the AI’s users. 

Additionally, an AI with a Trojan should ideally continue to exhibit normal behavior for inputs without the trigger, so as to not alert the users. Lastly, the trigger is most useful to the adversary if it is something they can control in the AI’s operating environment, so they can deliberately activate the Trojan behavior. Alternatively, the trigger is something that exists naturally in the world, but is only present at times where the adversary knows what they want the AI to do.



Obvious defenses against Trojan attacks include securing the training data (to protect data from manipulation), cleaning the training data (to make sure the training data is accurate), and protecting the integrity of a trained model (prevent further malicious manipulation of a trained clean model). 

Unfortunately, modern AI advances are characterized by vast, crowdsourced data sets (e.g., 10^9 data points) that are impractical to clean or monitor. Additionally, many bespoke AIs are created by transfer learning: take an existing, public AI published online and modify it a little for the new use case. 

Trojans can persist in an AI even after such transfer learning. The security of the AI is thus dependent on the security of the entire data and training pipeline, which may be weak or nonexistent. Furthermore, the user may not be the one doing the training. Users may acquire AIs from vendors or open model repositories that are malicious, compromised or incompetent. 

Acquiring an AI from elsewhere brings all of the problems with the data pipeline, as well as the possibility of the AI being modified directly while stored at a vendor or in transit to the user. Given the diffuse and unmanageable supply chain security, the focus for the TrojAI Program is on the operational use case where the complete AI is already in the would-be users’ hands: detect if an AI has a Trojan, to determine if it can be safely deployed.

Evidently, the arms race between defenders and attackers favors the attackers. The rise of fake news and ‘data poisoning’ attacks aimed at machine-learning-inspired cyber threat intelligence systems is the result of a new strategy adopted by attackers that adds complexity to an already complex and ever-changing cyber threat landscape.

Attackers are now exploiting a vulnerability in the data training process of AI and ML inspired cyber threat  intelligence systems by injecting ‘poisoned data’ in training datasets to allow their malicious code to evade  detection. The ‘poisoned’ corpus is specifically tailored and targeted to AI and ML cyber threat intelligence  defense systems, especially those based on supervised and semi-supervised learning algorithms to make them  misclassify malicious code as legitimate data.


The input data itself is validated by using a mix of related indicators to determine its reliability. Based on the validation of input data sources, the authors assume that the corpus is trustworthy and then add a security feature that prevents ‘data poisoning’ attacks. The solution is based on working with trusted sources of input raw data. The dynamics of the solution change completely if the input raw data comes with ‘poisoned data’ that mimics trusted data.


The arms race between cyberspace attackers and defenders continues. Attackers’ tools, tactics and procedures (TTPs) evolve so quickly that cyber defence, legislation and law enforcement lag behind.
On the one hand, new technology developments like cloud computing, social media, big data, machine learning, the internet of things and others are continually disrupting existing business models on a global scale.

Hence, there is a mad rush to adopt new business models that open up new risks. It is no coincidence that today’s cyber threat landscape reflects that the attackers are gaining more ground than the defenders. 

For example, attackers are very quick to adopt the latest technologies like artificial intelligence (AI) and machine learning (ML) to detect and exploit defence systems’ vulnerabilities and evade detection. It is asserted that attackers are using AI and ML to analyse the features cyber threat intelligence systems use to flag malware.


They then remove, or conceal through encryption, the code snippets in their malware that could raise red flags, so that the classification algorithms cannot catch them.


The gap between attackers and defenders seems to be widening even more. This is no coincidence, as today’s cyber attackers are well funded and organized; they have vast resources at their disposal and operate in a well-structured, coordinated and highly incentivized underground economy.

Today’s cyber attackers are patient and do their nefarious deeds with sophistication, targeting vulnerabilities in people, process and technology right across the globe with no respect for national boundaries. Attackers are now deploying advanced malware that leverages cutting-edge technology not only to circumvent advanced security defences but also to widen the scope and scale of their attacks.


Attackers are using autonomous and self-learning malware with catastrophic implications. The anonymity or plausible deniability of cyber threats adds to the already complex threat landscape. Cyber threats often emanate from behind a veil of Internet anonymity that hides details of the attackers.

This anonymity is one of the biggest challenges to mounting any defence against, or retaliation to, cyber threats. Hence the difficulty of determining who exactly is behind today’s cyber threats. This considerably challenges the field of cyberspace security and has raised the intractable issue of cybersecurity attribution. It must be noted that not knowing the enemy is one way to lose a battle.

Applying Chanakya’s theory to the cyberspace battlefield, it can be asserted that defenders who know their defence systems, the terrain of the cyberspace battlefield, and cybercriminals and their modus operandi have no reason to fear the result of a hundred cyberspace battles. However, it is also argued that defenders who know their defence systems and the cyberspace battlefield’s terrain but not their enemy will, for every victory gained, also suffer a defeat.

This means that all efforts to understand the battlefield and one’s own defence systems cannot guarantee victory without an effort to understand the tactics of the adversary. It is also argued that defenders who know neither their enemy nor their defence systems nor the terrain of the battlefield will always succumb in every battle that they engage in.

Today’s threat landscape reflects that  cybersecurity defenders might know something about their defence systems, but they have a partial view of the cyberspace battlefield and have almost zero knowledge of the attackers and their modus  operandi.

Furthermore, a reactive approach to defending systems also adds to the already complex threat landscape. Most organisations only act after a breach has already occurred. In the ever changing cyber  threat landscape of rapid zero-days, advanced persistent threats (APTs), botnets, ransomware and state-sponsored espionage activities; a secure company today would be vulnerable by tomorrow.

A reactive approach to defence is totally insufficient to address today’s ever-changing threats of fake news and ‘data poisoning’. Given the increasing use of predictive cybersecurity analytics and cyber threat intelligence platforms, which give system defenders a capability to somehow anticipate the signatures of new malware, this is inevitable.

Attackers have since realised that new malware gets detected at first appearance by AI and  ML inspired malware classifiers and detectors in current cyber threat intelligence systems. Hence,  instead of concentrating their efforts on developing new malware, they are now investing their  resources into finding ways to breach AI and ML inspired cyber threat intelligence defences.


There is a huge need for proactive defence  efforts that make use of cyber threat intelligence systems. This is to help organizations build a better  situational awareness, recommend resilient cyber security controls, and learn from breaches in order  to adapt and re-shape existing controls to improve cyber threat detection and system resilience. 

Most existing cyber threat intelligence (CTI) systems leverage the current data-driven economy to collect and collate massive cyber threat data from different source feeds. Attackers, on the other hand, have since realized an opportunity to ‘poison’ cyber threat data sources to try and circumvent detection systems.


Hence, it is important that a solution provides plausible means to validate, curate and secure input data to prevent ‘data poisoning’. A robust cyber threat intelligence system can provide highly technical metrics, countermeasures and corrective actions.

The goal is to use AI and ML algorithms in cyber threat intelligence platforms to transform threat data into actionable intelligence to help thwart cyber-attacks. This has also caused attackers to focus on targeting data-driven cyber threat intelligence systems.

For example, some of the existing solutions are based on supervised learning algorithms that classify code as either clean (trustworthy) or malicious (deceptive) based  upon known signatures or patterns. Therefore, all that it takes for an attacker to foil such systems is to access the training data and tag malware as clean code. Such a scenario is possible, mainly because existing systems are concerned about the analysis of the data and have turned a blind eye to its protection.


At times attackers do not even have to corrupt the training data sets; they can instead alter the underlying algorithms that process the data. Some existing threat intelligence systems are built on ‘black box’ deep learning algorithms, in an attempt to solve the problems of supervised learning algorithms.

However, with deep learning algorithms and the multiple layers in a neural network, there is no way to know how the system’s algorithms classify threats at the different levels of abstraction. This is a big cause for concern, because it is not completely clear how the algorithms make their decisions.

Black box deep learning algorithms must be understandable to their creators and accountable to their users.

The layered approach  ensures that even if malware can circumvent one layer, it can still be detected by the other layers.

Within each layer, there are a number of individual ML models trained to recognize new and emerging threats.

Their solution provides a robust, multi-faceted view of the new generation of threats that no single algorithm can achieve. The solution also uses stacked ensemble models that take predictions from the base ML classifiers and combine them to create even stronger malware predictions.
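As a generic illustration of a stacked ensemble (not the vendor’s actual system), scikit-learn’s StackingClassifier combines base classifiers through a meta-model; the synthetic features and chosen models below are placeholders:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Synthetic stand-in for extracted malware features and clean/malicious labels
X, y = make_classification(n_samples=2000, n_features=30, random_state=0)

base_learners = [                                   # each base model gives an independent "view"
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", LinearSVC(dual=False)),
]
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),           # combines base predictions into one verdict
)
stack.fit(X, y)
print(stack.predict(X[:5]))
```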


Data protection is required to ensure that malicious entities cannot  tamper with the training and test data.

An attacker may plant obfuscated attack samples in existing clusters to prevent clustering algorithms from detecting their malicious code. This has raised serious concerns around the integrity of AI and ML datasets and of the clustering algorithms built on them. Cyber threat intelligence systems must adopt cybersecurity countermeasures that embrace the secure-by-design principle to effectively thwart ‘data poisoning’ attacks, and must recognize the difficulty of identifying and separating ‘poisoned data’ samples from the legitimate corpus. 

Instead they choose to model their solution based on a small sample of training data, choosing only the points that are ‘close’ to legitimate points. Moreover, this work also made some interesting discoveries on the potential impact of ‘data poisoning’ in healthcare applications. For example, this study shows the effects that a small sample of ‘poisoned data’ points can have on the dosage of patients.

The results therein show that patient dosages can increase by an estimated 350%. This basically means that if patient health data is ‘poisoned’, patients could be made to take more drugs than required. This can potentially cause drug overdoses and result in deaths.

This is also based on supervised learning algorithms, which basically means that it suffers the same fate as other supervised learning algorithms, i.e. garbage in, garbage out. An attack on the clustering algorithms would reverse the results, making sources that are classified ‘not credible’ appear credible.


There are three ways that an attacker could use to evade detection by carefully adding moderate amounts of ‘poisoned data’ at different intervals. Though this approach is argued therein not to be worth it, it performs well at throwing off the balance of false positives and false negatives and hampering the efficacy of the detector.

The ANTIDOTE solution therein was designed to prevent ‘data poisoning’ attacks from shifting false positives and false negatives; hence, the solution is argued to reject contaminated data. Other work focuses on detecting ‘poisoned data’ samples using an anomaly detection system. These methods work best if the chosen small training data sample is trustworthy. Should the contrary be true, both propositions fail dismally and can even result in ‘poisoned’ outlier detectors.


This would basically mean that they both work in reverse, i.e. flagging malicious data as legitimate and legitimate data as malicious. The ‘black box’ approach of unsupervised classifiers raises trust issues because it is not clear how the algorithms make their decisions in classifying data points. Hence the rise of research that attempts to solve the serious ‘data poisoning’ issues of AI and ML based cyber threat intelligence systems, where ‘poisoned data’ stands to poison even the data classifiers.

Data cleansing of the corpus includes removing duplicate entries and null values. This exercise does not compromise the completeness and/or integrity of the data as obtained from the original source.
However, the removal of duplicate entries ensures that there is only one record for each and every entry. This also helps in reducing the size of the corpus, especially for storage purposes. 

IP addresses can change every time a machine is rebooted, owing to the DHCP protocol. Should an attacker attempt to inject ‘poisoned data’, the hash value of the changed data object would reflect that the data has been changed. This raises a flag and sends the administrator an alert. Once the data objects have been hashed and indexed, they are then encrypted with AES-128 for secure storage. The encryption is per data object rather than over the entire corpus. 

The security of the cryptography comes with a performance cost, in that each data object has to be individually decrypted before the data can be processed. The encryption process is not necessarily a problem, because by the time the data is put into the database there is no real-time requirement for it to be processed. So encryption is not as time-bound as the decryption process. The decryption performance cost is balanced by the hash-based quick retrieval of data objects.
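A minimal sketch of the hash-index-encrypt pipeline described above, assuming Python’s hashlib and the `cryptography` package (whose Fernet recipe uses AES-128 with an integrity check); the field names and store layout are illustrative, not the paper’s actual implementation:

```python
import hashlib, json
from cryptography.fernet import Fernet   # Fernet = AES-128-CBC plus an HMAC integrity check

key = Fernet.generate_key()
fernet = Fernet(key)
store = {}                               # hash-indexed store of encrypted data objects

def ingest(data_object: dict) -> str:
    raw = json.dumps(data_object, sort_keys=True).encode()
    digest = hashlib.sha256(raw).hexdigest()   # index + tamper evidence
    store[digest] = fernet.encrypt(raw)        # encrypted per data object, not per corpus
    return digest

def verify(digest: str) -> bool:
    """Re-hash the decrypted object; a mismatch means the record was altered."""
    raw = fernet.decrypt(store[digest])
    return hashlib.sha256(raw).hexdigest() == digest

idx = ingest({"ip": "203.0.113.7", "reliability_score": 0.92, "object_id": "A1"})
print(verify(idx))    # True unless the stored object has been tampered with
```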


The hashed, indexed and encrypted data objects are stored in an encrypted database. This adds another layer of security to prevent unauthorized access to the database: an attacker would have to go through the database encryption before they could get to the encrypted data objects, so it takes two layers to get to the plain-text data. The system uses a need-to-know principle to restrict access to the database. 

Hence, database access is restricted to the module of the system that processes and analyses the data, and to administrators only. The system also has a pipelining feature to monitor incomplete processes and alert the administrators when they occur. The data includes an IP address, a reliability score, a unique object ID, the date of verification and other fields; the contents of each data object vary depending on the data source.

Attackers are now exploiting a vulnerability in the data training process of AI and ML inspired cyber  threat intelligence systems to allow their malicious code to evade detection.

One of the major safety strategies that has arisen from this research is an approach called model hardening, which has advanced techniques that combat adversarial attacks by  strengthening the architectural components of the systems. Model hardening techniques  may include adversarial training, where training data is methodically enlarged to include  adversarial examples.

Other model hardening methods involve architectural modification, regularisation, and data pre-processing manipulation. A second notable safety strategy is runtime detection, where the system is augmented with a discovery apparatus that can identify  and trace in real-time the existence of adversarial examples. 

You should consult with  members of your technical team to ensure that the risks of adversarial attack have been  taken into account and mitigated throughout the AI lifecycle. This  threat to safe and reliable AI involves a malicious compromise of data sources at the point of  collection and pre-processing. 

Data poisoning occurs when an adversary modifies or  manipulates part of the dataset upon which a model will be trained, validated, and tested. By altering a selected subset of training inputs, a poisoning attack can induce a trained AI system  into curated misclassification, systemic malfunction, and poor performance. 

An especially  concerning dimension of targeted data poisoning is that an adversary may introduce a  ‘backdoor’ into the infected model whereby the trained system functions normally until it  processes maliciously selected inputs that trigger error or failure.

In order to combat data poisoning, your technical team should become familiar with the  state of the art in filtering and detecting poisoned data. However, such technical solutions  are not enough. Data poisoning is possible because data collection and procurement often  involves potentially unreliable or questionable sources. 

When data originates in uncontrollable environments like the internet, social media, or the Internet of Things, many opportunities present themselves to ill-intentioned attackers, who aim to manipulate training examples. Likewise, in third-party data curation processes (such as ‘crowdsourced’ labelling, annotation, and content identification), attackers may simply handcraft malicious inputs.

Data poisoning is now a recognised area of concern in cybersecurity. The goal of such attacks is to pervert a learning system by manipulating a small subset of the training data, and it could be a serious threat to deep learning-based malware detection.


Indeed, the system will be trained on a large and uncontrolled set of software produced by other people (some of whom may be hackers). Thus, it is important not to forget that a few data-poisoned samples could have been introduced into the training data.

In particular, training processes for deep learning do not form hypotheses about the integrity of the training data. In addition, research shows that deep learning is sensitive to such attacks: on MNIST, reported results show the error rate jumping from 1.3% to 2% and 4% after manipulating just 3% and 6% of the training dataset, respectively.

So deep networks are sensitive both to adversarial examples and to data poisoning, and poisoning that is invisible to human eyes can be generated by adding adversarial noise to the training data.

The most classical data poisoning task consists of freely manipulating a small subset of the training data so that, when trained on these data, the targeted system has low testing accuracy. Adversarial attacks are very impressive, especially in computer vision, where small perturbations of images usually go undetected by human eyes.

The concern can be broken down into specific areas:---

The ability of hackers to steal data used to train the algorithms.
The manipulation of data to provide incorrect results.
The use of AI to impersonate authorized users to access a network.
The ability of AI to automate cyberattacks.
If you think about the ability to reverse engineer an algorithm … you’re effectively stealing that application, and you’re displacing the competitive advantage the company that developed it may have in the marketplace.



Augmented intelligence is an alternative conceptualization of artificial intelligence that focuses on AI's assistive role, emphasizing the fact that cognitive technology is designed to enhance human intelligence rather than replace it. The choice of the word augmented, from "augment", meaning "to improve", reinforces the role human intelligence plays when using machine learning and deep learning algorithms to discover relationships and solve problems.

While a sophisticated AI program is certainly capable of making a decision after analyzing patterns in large data sets, that decision is only as good as the data that human beings gave the programming to use.

Deep Learning goes much further and attempts to analyze the nature of the phenomena that the data represents, including discovery of rules of behavior, interactions, and strategy.

Artificial intelligence and machine learning are often used interchangeably. However, they mean different things: machine learning algorithms are simply a subset of AI in which the algorithms can undergo improvement after being deployed. This is known as self-improvement and is one of the most important parts of creating the AI of the future.

Computer vision relies on pattern recognition and deep learning to recognize what’s in a picture or video. When machines can process, analyze and understand images, they can capture images or videos in real time and interpret their surroundings.

The financial technology sector has already started using AI to save time, reduce costs, and add value. Deep learning is changing the lending industry by using more robust credit scoring. Credit decision-makers can use AI for robust credit lending applications to achieve faster, more accurate risk assessment, using machine intelligence to factor in the character and capacity of applicants.

Deep-learning methods require thousands of observations for models to become relatively good at classification tasks and, in some cases, millions for them to perform at the level of humans. Deep learning flourishes at giant tech companies, which use big data to accumulate petabytes of data; that volume allows them to create impressive and highly accurate deep learning models.

Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANNs), a widely used model in the field of machine learning. NAS has been used to design networks that are on par with or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, search strategy and performance estimation strategy used.

As impressive as NAS algorithms are, they come with a caveat. Their main limiting factor is computational power. The more you have, the faster the controller algorithm would iterate through architectures before reaching the best fit solution. And since most machine learning problems require a fair bit of ‘searching,’ using NAS turns into an arms race for computing power.
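
A hedged sketch of the simplest search strategy, random search, over a tiny illustrative search space of fully connected networks, assuming scikit-learn; real NAS systems are far more sophisticated, but the three ingredients (search space, search strategy, performance estimation) are the same:

import random
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

random.seed(0)
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

best = (None, 0.0)
for _ in range(10):                                   # search strategy: random sampling
    depth = random.randint(1, 3)                      # search space: depth and width
    width = random.choice([16, 32, 64, 128])
    arch = tuple([width] * depth)
    clf = MLPClassifier(hidden_layer_sizes=arch, max_iter=300, random_state=0)
    clf.fit(X_tr, y_tr)
    score = clf.score(X_te, y_te)                     # performance estimation: held-out accuracy
    if score > best[1]:
        best = (arch, score)

print("best architecture:", best[0], "accuracy:", round(best[1], 3))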

Thus, being resource-heavy makes NAS algorithms relatively inaccessible to everyone but a few technological giants with vast access to computing power and data.

However, a project called ScyNet is going down a path that could substantially lower the barrier to accessing NAS algorithms. To do that, it creates a decentralized open network where people are rewarded for sharing computational resources and data.

Data is all around us. The Internet of Things (IoT) and sensors have the ability to harness large volumes of data, while artificial intelligence (AI) can learn patterns in the data to automate tasks for a variety of business benefits.
Artificial Intelligence enhances the speed, precision and effectiveness of human efforts. In financial institutions, AI techniques can be used to identify which transactions are likely to be fraudulent, adopt fast and accurate credit scoring, as well as automate manually intense data management tasks.
Not all features are meaningful for the algorithm. A crucial part of machine learning is to find a relevant set of features so that the system can learn something.

One way to perform this part in machine learning is to use feature extraction. Feature extraction combines existing features to create a more relevant set of features. It can be done with PCA, T-SNE or any other dimensionality reduction algorithms.
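
A minimal scikit-learn sketch of feature extraction with PCA, using the bundled digits dataset purely as an illustration:

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)       # shape (1797, 64): raw pixel features
pca = PCA(n_components=10)                # keep the 10 most informative directions
X_reduced = pca.fit_transform(X)          # shape (1797, 10): extracted features

print(X.shape, "->", X_reduced.shape)
print("variance explained:", round(pca.explained_variance_ratio_.sum(), 3))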

For example, in image processing the practitioner traditionally needs to extract features from the image manually, such as the eyes, the nose, the lips and so on. Those extracted features are then fed to the classification model.
When there is enough data to train on, deep learning achieves impressive results, especially for image recognition and text translation. The main reason is the feature extraction is done automatically in the different layers of the network.

One of the earliest examples of AI in security was in the filtering of spam email. Instead of applying crude filters to email, probabilities were applied using Bayesian filters. In this approach, the user would identify whether an email was spam. 

The algorithm would then adjust the probabilities associated with each word based on whether the messages containing it were spam. Deep learning has also been found effective for malware classification. In this context, an executable application consists of a series of bytes that have a defined structure along with numeric instructions that are run on the given processor architecture.
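
A hedged sketch of that spam-filtering idea using a naive Bayes classifier in scikit-learn; the four "emails" and their labels are invented purely for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = [
    "win a free prize now", "cheap meds limited offer",       # labelled spam by the user
    "meeting agenda for monday", "please review the report",  # labelled legitimate
]
labels = [1, 1, 0, 0]                                          # 1 = spam, 0 = not spam

vec = CountVectorizer()
X = vec.fit_transform(emails)              # word counts per message
clf = MultinomialNB().fit(X, labels)       # per-word probabilities learned from the labels

test = vec.transform(["free offer, claim your prize"])
print("P(spam):", round(clf.predict_proba(test)[0, 1], 3))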

Rather than use the executable’s numerical instructions directly, these instructions are encoded using an embedded encoding. This entails taking an instruction numeric and translating it into a higher dimensional space (similar to the way words are encoded into vectors with word2vec for use by deep learning). The embedded encodings can then be applied to the deep neural network (DNN) through the convolutional and pooling layers to yield a classification. 

The DNN was trained on a set of executables representing both normal programs and malware, and it could successfully isolate the features that distinguish a malware program from a typical program. FireEye demonstrated this approach with upwards of 96-percent accuracy in detecting malware in the wild. With deep learning, algorithms can operate on relatively raw data and extract features without human intervention.
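
To make the byte-embedding idea concrete, here is a hedged PyTorch sketch with synthetic byte sequences standing in for executables; it is not the FireEye system, just an illustration of an embedding feeding convolution, pooling and a classifier:

import torch
import torch.nn as nn

torch.manual_seed(0)
SEQ_LEN, VOCAB = 256, 256                             # fixed-length byte windows, byte values 0..255
X = torch.randint(0, VOCAB, (64, SEQ_LEN))            # 64 synthetic "executables"
y = torch.randint(0, 2, (64,))                        # 0 = benign, 1 = malware (synthetic labels)

class ByteCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 16)          # embed each byte in a 16-d space
        self.conv = nn.Conv1d(16, 32, kernel_size=5, padding=2)
        self.fc = nn.Linear(32, 2)

    def forward(self, x):
        e = self.embed(x).permute(0, 2, 1)            # (batch, channels, length) for Conv1d
        h = torch.relu(self.conv(e))
        h = h.max(dim=2).values                       # global max pooling over the sequence
        return self.fc(h)                             # two-class scores

model = ByteCNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                                # token training loop on the synthetic data
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

print("training loss:", round(float(loss), 4))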

Deep learning methods significantly improve detection of threats. Development and productization of deep learning systems for cyber defense require large volumes of data, computation, resources, and engineering effort. Stronger detection of malicious PowerShell scripts and other threats on endpoints using deep learning means richer and better-informed security.

Deep learning is successful in a wide range of tasks, but no one knows exactly why it works and when it fails. Because deep learning models will be used more and more in the future, we should not put blind faith in models that are not well understood and that can put lives at risk.

Very powerful AI tools can fail badly in seemingly trivial scenarios. Researchers have developed methods to analyze these scenarios and establish "risk" maps that identify such risks in real applications.

AI models are commonly tested for robustness using pixel perturbations, that is, by adding noise to an image to try to fool the deep-trained network. However, artificial pixel perturbations are unlikely to occur in real-life applications; it is much more likely that semantic or contextual scenarios occur that the networks have not been trained on.

Algorithmic models can be punitive, discriminatory and, in some instances, they can even be illegal. Many add an extra layer of harm to already vulnerable population groups. Our own values and desires influence our choices: from the data we choose to collect, to the questions we ask. Models are opinions embedded in mathematics.

For example, rather than being faced with a manipulated image of an object, an autonomous driving system is more likely to be faced with an object orientation or a lighting scenario that it has not learned or encountered before; as a result, the system may not recognize what could be another vehicle or nearby pedestrian. This is the type of failure that has occurred in a number of high-profile incidents involving self-driving vehicles.

Deep learning is a class of machine learning algorithms that use multiple layers to progressively extract higher level features from raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify human-meaningful items such as digits or letters or faces.

With machine learning you need less data to train the algorithm than with deep learning; deep learning requires an extensive and diverse dataset to identify the underlying structure. Machine learning also gives you a faster-trained model, whereas the most advanced deep learning architectures can take days to a week to train. The advantage of deep learning over machine learning is that it is highly accurate, and you do not need to understand which features best represent the data: the neural network learns how to select the critical features itself. In machine learning, you need to choose the features to include in the model yourself.


ML/DL are creating a significant impact on our lives. They have vastly improved recognition techniques, e.g. face, speech and handwriting recognition. They help online companies recommend products to users, understand responses, filter emails, and so on. They help predict outages in IT systems and telecom networks, and diseases in patients. They are replacing analysts in predicting financial market movements, allocating assets, preparing tax returns and more.

Deep Learning in particular has changed the course of Natural Language Processing (NLP), as computers now understand, generate and translate written text as well as speech. The rise of autonomous robots and vehicles is directly linked to advances in Artificial Intelligence. Affective computing focuses on systems that can even decipher emotions: such computers would use facial expressions, posture, gestures, speech, body temperature and so on to understand the emotional state of the user and adapt their response accordingly. The list is endless.

Data Efficient Learning is the ability to learn complex task domains without requiring massive amounts of data. Although supervised deep learning can address the problem of learning from large datasets, for many real-world problems the amount of available training data is not sufficient to use such systems.

In deep learning, data is typically pushed through a "deep" stack of activation layers. Each layer builds a representation of the data, with subsequent layers using the features of the previous layer to build more complex representations. The output of the final layer is mapped to a category to which the data should belong. Getting this final mapping correct is the objective of a deep learning algorithm.
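
As a minimal illustration of such a stack, assuming PyTorch, with layer sizes chosen arbitrarily for the sketch:

import torch
import torch.nn as nn

# Each layer builds on the representation produced by the previous one;
# the final layer maps the representation to one of 10 categories.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # early layer: low-level features
    nn.Linear(256, 64), nn.ReLU(),    # later layer: more abstract features
    nn.Linear(64, 10),                # output layer: category scores
)

x = torch.randn(1, 784)               # one flattened 28x28 image (illustrative)
print(model(x).shape)                 # torch.Size([1, 10])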

At first, the computer program is provided with training data, for example images that have been labeled with meta tags. The algorithm uses this information to build a progressively more accurate predictive capability. In contrast, shallow machine learning approaches rely on a substantial amount of feature engineering processes carried out by humans before a model can learn the relationship between features; in deep learning, however, the system acquires the features and their relationships simultaneously.


Evolutionary algorithms (EA) are a subset of evolutionary computation—algorithms that mimic biological evolution to solve complex problems.
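
A minimal sketch of the idea, assuming NumPy, with an invented fitness function and illustrative population settings (selection, crossover and mutation are the three biological operations being mimicked):

import numpy as np

rng = np.random.default_rng(0)
fitness = lambda x: -np.sum((x - 3.0) ** 2)       # best possible candidate is all 3s

pop = rng.normal(size=(30, 5))                    # 30 candidates, 5 "genes" each
for generation in range(100):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-10:]]       # selection: keep the 10 fittest
    children = []
    for _ in range(len(pop)):
        a, b = parents[rng.integers(10, size=2)]
        mask = rng.random(5) < 0.5                # crossover: mix genes from two parents
        child = np.where(mask, a, b)
        child += 0.1 * rng.normal(size=5)         # mutation: small random change
        children.append(child)
    pop = np.array(children)

best = max(pop, key=fitness)
print("best individual:", np.round(best, 2))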




The amount of data we generate every day is staggering—currently estimated at 2.7 quintillion bytes—and it’s the resource that makes deep learning possible. Since deep-learning algorithms require a ton of data to learn from, this increase in data creation is one reason that deep learning capabilities have grown in recent years. 

In addition to more data creation, deep learning algorithms benefit from the stronger computing power that’s available today as well as the proliferation of Artificial Intelligence (AI) as a Service. AI as a Service has given smaller organizations access to artificial intelligence technology and specifically the AI algorithms required for deep learning without a large initial investment. 

Deep learning allows machines to solve complex problems even when using a data set that is very diverse, unstructured and inter-connected. The more deep learning algorithms learn, the better they perform.

8 practical examples of deep learning--

Now that we’re in a time when machines can learn to solve complex problems without human intervention, what exactly are the problems they are tackling? Here are just a few of the tasks that deep learning supports today and the list will just continue to grow as the algorithms continue to learn via the infusion of data.

      Virtual assistants
Whether it’s Alexa, Siri or Cortana, the virtual assistants of online service providers use deep learning to help understand your speech and the language humans use when they interact with them.

      Translations
In a similar way, deep learning algorithms can automatically translate between languages. This can be powerful for travelers, business people and those in government.

      Vision for driverless delivery trucks, drones and autonomous cars
The way an autonomous vehicle understands the realities of the road and how to respond to them, whether it’s a stop sign, a ball in the street or another vehicle, is through deep learning algorithms. The more data the algorithms receive, the better they are able to act human-like in their information processing, knowing that a stop sign covered with snow is still a stop sign.

      Chatbots and service bots
Chatbots and service bots that provide customer service for a lot of companies are able to respond in an intelligent and helpful way to an increasing amount of auditory and text questions thanks to deep learning.

      Image colorization
Transforming black-and-white images into color was formerly a task done meticulously by human hand. Today, deep learning algorithms are able to use the context and objects in the images to color them to basically recreate the black-and-white image in color. The results are impressive and accurate.

      Facial recognition
Deep learning is being used for facial recognition not only for security purposes but for tagging people in Facebook posts, and in the near future we might be able to pay for items in a store just by using our faces. The challenge for deep-learning algorithms in facial recognition is knowing it’s the same person even when they have changed hairstyles, grown or shaved off a beard, or when the image is poor due to bad lighting or an obstruction.

      Medicine and pharmaceuticals
From disease and tumor diagnoses to personalized medicines created specifically for an individual’s genome, deep learning in the medical field has the attention of many of the largest pharmaceutical and medical companies.

      Personalized shopping and entertainment
Ever wonder how Netflix comes up with suggestions for what you should watch next? Or where Amazon comes up with ideas for what you should buy next and those suggestions are exactly what you need but just never knew it before? Yep, it’s deep-learning algorithms at work.

The more experience deep-learning algorithms get, the better they become. It should be an extraordinary few years as the technology continues to mature.

Though based loosely on the way neurons communicate in the brain, these “deep learning” systems remain incapable of many basic functions that would be essential for primates and other organisms. In artificial neural networks, “catastrophic forgetting” refers to the difficulty in teaching the system to perform new skills without losing previously learned functions. 

For example, if a network initially trained to distinguish between photos of dogs and cats is then re-trained to distinguish between dogs and horses, it will lose its earlier ability. By contrast, the brain is capable of “continual learning,” acquiring new knowledge without eliminating old memories, even when the same neurons are used for multiple tasks. 
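
A small sketch of this effect, assuming scikit-learn, with two "tasks" carved out of its digits dataset; the split, the classifier and the loss name are illustrative (older scikit-learn versions spell the loss "log" rather than "log_loss"):

import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)
task_a = np.isin(y, [0, 1, 2, 3, 4])      # "task A": digits 0-4
task_b = ~task_a                          # "task B": digits 5-9

clf = SGDClassifier(loss="log_loss", random_state=0)
classes = np.unique(y)

# Phase 1: learn task A only.
for _ in range(20):
    clf.partial_fit(X[task_a], y[task_a], classes=classes)
print("task A accuracy after phase 1:", round(clf.score(X[task_a], y[task_a]), 3))

# Phase 2: learn task B only, with no rehearsal of task A.
for _ in range(20):
    clf.partial_fit(X[task_b], y[task_b])
print("task A accuracy after phase 2:", round(clf.score(X[task_a], y[task_a]), 3))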

One strategy the brain uses for this learning challenge is the selective activation of cells or cellular components for different tasks—essentially turning on smaller, overlapping sub-networks for each individual skill, or under different contexts.

Deep learning techniques are based on artificial neural networks arranged in different layers, each of which calculates the values for the next one so that the information is processed more and more completely.

Usually, a set of known answers to the problem is used to “train” the network, but when these are not known, another technique called “reinforcement learning” can be used.

In this approach two neural networks are used: an “actor” has the task of finding new solutions, and a “critic” must assess the quality of these solutions. Provided the researchers can supply a reliable way to judge the respective results, these two networks can examine the problem independently.

Reinforcement learning differs from supervised learning: in supervised learning the training data comes with the answer key, so the model is trained with the correct answers, whereas in reinforcement learning there is no answer key and the reinforcement agent decides what to do to perform the given task. In the absence of a training dataset, it is bound to learn from its own experience.
Main points in Reinforcement Learning –

Input: The input should be an initial state from which the model will start.
Output: There are many possible outputs, as there is a variety of solutions to a particular problem.
Training: The training is based upon the input; the model returns a state and the user decides whether to reward or punish the model based on its output.
The model keeps learning continuously.
The best solution is decided based on the maximum reward.
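
A minimal tabular Q-learning sketch of this loop, with an invented one-dimensional "corridor" environment and illustrative constants (the agent starts in state 0 and is rewarded for reaching the last state):

import numpy as np

np.random.seed(0)
N_STATES, N_ACTIONS = 6, 2             # actions: 0 = move left, 1 = move right
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.9, 0.2      # learning rate, discount, exploration

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

for episode in range(200):
    s, done = 0, False
    while not done:
        a = np.random.randint(N_ACTIONS) if np.random.rand() < eps else int(Q[s].argmax())
        nxt, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward reward + discounted best future value.
        Q[s, a] += alpha * (r + gamma * Q[nxt].max() - Q[s, a])
        s = nxt

print(np.round(Q, 2))   # the learned values favour moving right in every state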

Types of Reinforcement: There are two types of reinforcement:

Positive –
Positive reinforcement occurs when an event, produced by a particular behavior, increases the strength and frequency of that behavior. In other words, it has a positive effect on the behavior.
Advantages of positive reinforcement:

Maximizes performance
Sustains change for a long period of time
Disadvantages of positive reinforcement:

Too much reinforcement can lead to an overload of states, which can diminish the results
Negative –
Negative reinforcement is defined as the strengthening of a behavior because a negative condition is stopped or avoided.
Advantages of negative reinforcement:

Increases behavior
Enforces a minimum standard of performance
Disadvantages of negative reinforcement:

It only provides enough to meet the minimum behavior
Various Practical applications of Reinforcement Learning –

RL can be used in robotics for industrial automation.
RL can be used in machine learning and data processing.
RL can be used to create training systems that provide custom instruction and materials according to the requirements of students.
RL can be used in large environments in the following situations:

A model of the environment is known, but an analytic solution is not available;
Only a simulation model of the environment is given (the subject of simulation-based optimization)

The only way to collect information about the environment is to interact with it.



Constraint programming, differential programming (i.e. Deep Learning) and generative programming share a common trait: the program or algorithm that discovers the solution is fixed. In other words, a programmer does not need to write the program that translates the specification into a solution. Unfortunately, though, such a fixed program is applicable only in narrow domains.

This is known as the “No Free Lunch” theorem in machine learning: you can’t use a linear programming algorithm to solve an integer programming problem. Deep Learning, however, has a unique kind of general capability in that the same kind of algorithm (i.e. stochastic gradient descent) appears to be applicable to many problems.
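
As a deliberately tiny illustration of stochastic gradient descent applied to an ordinary least-squares problem, assuming NumPy; the data and step size are invented:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

w = np.zeros(3)
lr = 0.05
for epoch in range(50):
    for i in rng.permutation(len(X)):           # one random example at a time
        grad = 2 * (X[i] @ w - y[i]) * X[i]     # gradient of the squared error for that example
        w -= lr * grad                          # gradient step

print("recovered weights:", np.round(w, 2))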

While some machine learning models – like decision trees – are transparent, the majority of models used today – like deep neural networks, random forests, gradient boosting machines, and ensemble models – are black-box models. 

Large and complex models can be hard to explain in human terms, for instance why a particular decision was reached. This is one reason that acceptance of some AI tools is slow in application areas where interpretability is useful or indeed required.

Furthermore, as the application of AI expands, regulatory requirements could also drive the need for more explainable AI models.

There will be no runaway AIs, there will be no self-developing AIs out of our control. There will be no singularities. AI will only be as intelligent as we encourage (or force) it to be, under duress.
But AIs will never have human intelligence. A more advanced AI will fit us so closely that it will become integrated within us and our societies.

Today all AIs are so limited in their intelligence that we cannot create general purpose intelligences using a single approach. There is no single AI on the planet (not even the fashionable “Deep Learning”) that can use the same method to process speech, drive a car, learn how to play a complex video game, control a robot running along a busy city street, wash dishes in a sink, and plan a strategy to win investment for a company.

The military is also developing and testing many other kinds of autonomous aircraft and ground vehicles.

However, soldiers would likely take some time to feel comfortable inside a robotic tank.

Especially if that tank does not explain its decisions to soldiers.


It is also true that intelligence analysts would be reluctant to act on information that does not come with proper reasoning.

Existing machine learning computer systems  produce a good amount of false alarms.

Because of that, an intelligence analyst would really require a good bit of help in order to understand why the new machine learning system made a recommendation that it made.

The artificial intelligence community still has a long way to go if it truly wants interpretable artificial intelligence.

Knowing the reasoning behind artificial intelligence’s decisions will become crucial if this type of technology is to evolve into a common and useful part of people’s daily lives.

Deep learning is fundamentally blind to cause and effect. Unlike a real doctor, a deep learning algorithm cannot explain why a particular image may suggest disease. This means deep learning must be used cautiously in critical situations.

A robot that understands that dropping things causes them to break would not need to toss dozens of vases onto the floor to see what happens to them.

Humans don't need to live through many examples of accidents to drive prudently; they can just imagine accidents.

But deep learning algorithms aren’t good at generalizing, or taking what they’ve learned from one context and applying it to another. They also capture phenomena that are correlated—like the rooster crowing and the sun coming up—without regard to which causes the other.
.
The “black box” complexity of deep learning techniques creates the challenge of “explainability,” or showing which factors led to a decision or prediction, and how. This is particularly important in applications where trust matters and predictions carry societal implications, as in criminal justice applications or financial lending. Some nascent approaches, including local interpretable model-agnostic explanations (LIME), aim to increase model transparency.

The types of AI being deployed are still limited. Almost all of AI’s recent progress is through one type, in which some input data X is used to generate some output response Y—where the algorithms identify complex input and output relationships. 

The most common deep learning networks (containing millions of simulated “neurons” structured in layers) are convolutional neural networks (CNNs) and recurrent neural networks (RNNs).


Then there are combinations of these networks, like generative adversarial networks, where two networks compete against each other and square off to improve their understanding. The X, Y systems have been improving rapidly with these neural networks.

There will be breakthroughs that make higher levels of intelligence possible, but current AI is far from the science-fiction version and falls short of answering open-ended queries.

Often, artificial intelligence (AI) applications employ neural networks that produce results using algorithms with a complexity level that only computers can make sense of. In other instances, AI vendors will not reveal how their AI works. In either case, when conventional AI produces a decision, human end users don’t know how it arrived at its conclusions.

This black box can pose a significant obstacle. Even though a computer is processing the information, and the computer is making a recommendation, the computer does not have the final say. That responsibility falls on a human decision maker, and this person is held responsible for any negative consequences.

As AI applications have expanded, machines are being tasked with making decisions where millions of dollars -- or even human health and safety -- are on the line. In highly regulated, high-risk/high-value industries, there's simply too much at stake to trust the decisions of a machine at face value, with no understanding of a machine’s reasoning or the potential risks associated with a machine’s recommendations. These enterprises are increasingly demanding explainable AI (XAI).

Many of the algorithms used for machine learning are not able to be examined after the fact to understand specifically how and why a decision has been made. This is especially true of the most popular algorithms currently in use – specifically, deep learning neural network approaches. 

As humans, we must be able to fully understand how decisions are being made so that we can trust the decisions of AI systems. The lack of explainability hampers our ability to fully trust AI systems. We want computer systems to work as expected and to produce transparent explanations and reasons for the decisions they make. This is known as Explainable AI (XAI).



One way to gain explainability in AI systems is to use machine learning algorithms that are inherently explainable. For example, simpler forms of machine learning such as decision trees, Bayesian classifiers, and other algorithms that have certain amounts of traceability and transparency in their decision making can provide the visibility needed for critical AI systems without sacrificing too much performance or accuracy. 
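
For instance, here is a hedged scikit-learn sketch of the decision-tree option just mentioned, using the bundled iris dataset purely for illustration; the learned rules can be printed and audited line by line:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

# The full decision logic, human readable -- no black box.
print(export_text(tree, feature_names=list(iris.feature_names)))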

More complicated, but also potentially more powerful algorithms such as neural networks, ensemble methods including random forests, and other similar algorithms sacrifice transparency and explainability for power, performance, and accuracy.

Traceability will enable humans to get into AI decision loops and to stop or control AI tasks whenever the need arises. An AI system is expected not only to perform a certain task or impose decisions but also to give a transparent report of why it reached specific conclusions.

Local Interpretable Model-Agnostic Explanations (LIME) is a method developed to gain greater transparency on what’s happening inside an algorithm.


LIME incorporates interpretability both in the optimization and the notion of interpretable representation, such that domain and task specific interpretability criteria can be accommodated.
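
A hedged sketch of LIME on tabular data, assuming the open-source lime package is installed; the model, dataset and parameters are illustrative, and the call names follow that package's documented API rather than anything in this post:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

iris = load_iris()
model = RandomForestClassifier(random_state=0).fit(iris.data, iris.target)  # the "black box"

explainer = LimeTabularExplainer(
    iris.data,
    feature_names=iris.feature_names,
    class_names=list(iris.target_names),
    mode="classification",
)

# Explain one prediction: which features pushed the model toward its answer?
exp = explainer.explain_instance(iris.data[0], model.predict_proba, num_features=4)
print(exp.as_list())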




Cognitive Computing refers to the individual technologies that perform specific tasks to facilitate human intelligence. These are smart decision support systems of the kind we have been working with since the beginning of the internet boom.

With recent breakthroughs in technology, these decision support systems simply use better data and better algorithms to produce a better analysis of vast stores of information.

Therefore, cognitive computing refers to:--
understanding and simulating reasoning
understanding and simulating human behavior


Using cognitive computing systems, every day, we make better human decisions at work.



Cognitive, bio-inspired AI solutions that employ human-like reasoning and problem-solving let users look inside the black box. In contrast to conventional AI approaches, cognitive AI solutions pursue knowledge using symbolic logic on top of numerical data processing techniques like machine learning, neural networks and deep learning.

The neural networks employed by conventional AI must be trained on data, but they don’t have to understand it the way humans do. They “see” data as a series of numbers, label those numbers based on how they were trained, and solve problems using pattern recognition. When presented with data, a neural net asks itself whether it has seen it before and, if so, how it was labeled previously.

In contrast, cognitive AI is based on concepts. A concept can be described at the strict relational level, or natural language components can be added that allow the AI to explain itself. A cognitive AI says to itself: “I have been educated to understand this kind of problem. You're presenting me with a set of features, so I need to manipulate those features relative to my education.”

Cognitive systems do not do away with neural nets, but they do interpret the outputs of neural nets and provide a narrative annotation. Decisions made by cognitive AI are delivered in clear audit trails that can be understood by humans and queried for more detail. These audit trails explain the reasoning behind the AI’s recommendations, along with the evidence, risk, confidence and uncertainty.

A robust cognitive AI system can automatically adjust the depth and detail of its explanations based on who is viewing the information and on the context of how the information will be used.

In most cases, the easiest way for humans to visualize decision processes is by the use of decision trees, with the top of the tree containing the least amount of information and the bottom containing the most. With this in mind, explainability can generally be categorized as either top-down or bottom-up.

The top-down approach is for end users who are not interested in the nitty-gritty details; they just want to know whether an answer is correct or not. A cognitive AI might, for example, generate a prediction of what a piece of equipment will produce in its current condition.

More technical users can then look at the detail, determine the cause of the issue and then hand it off to engineers to fix. The bottom-up approach is useful to the engineers who must diagnose and fix the problem. These users can query the cognitive AI to go all the way to the bottom of the decision tree and look at the details that explain the AI’s conclusion at the top.
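
As a hedged scikit-learn sketch of that bottom-up walk (the dataset and tree are illustrative), one can print every test on the path from the root to the leaf for a single input:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

sample = iris.data[100:101]                        # one flower to explain
node_path = tree.decision_path(sample).indices     # node ids visited, root first
t = tree.tree_

for node in node_path:
    if t.children_left[node] == -1:                # leaf node: the final decision
        print(f"leaf {node}: predicted class = {iris.target_names[tree.predict(sample)[0]]}")
    else:
        feat, thr = t.feature[node], t.threshold[node]
        went_left = sample[0, feat] <= thr
        print(f"node {node}: {iris.feature_names[feat]} = {sample[0, feat]:.2f} "
              f"{'<=' if went_left else '>'} {thr:.2f}")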

Explainable AI begins with people. AI engineers can work with subject matter experts and learn about their domains, studying their work from an algorithm/process/detective perspective. What the engineers learn is encoded into a knowledge base that enables the cognitive AI to verify its recommendations and explain its reasoning in a way that humans can understand.

A cognitive AI is future-proof. Although governments have been slow to regulate AI, legislatures are catching up. The European Union’s General Data Protection Regulation (GDPR), a data governance and privacy law that went into effect in May 2018, grants consumers the right to know when automated decisions are being made about them, the right to have these decisions explained and the right to opt out of automated decision-making completely. Enterprises that adopt XAI now will be prepared for future compliance mandates.


AI is not supposed to replace human decision making; it is supposed to help humans make better decisions. If people do not trust the decision-making capabilities of an AI system, these systems will never achieve wide adoption. For humans to trust AI, systems must not lock all of their secrets inside a black box. XAI provides that explanation.

Cognitive Computing focuses on mimicking human behavior and reasoning to solve complex problems.

Cognitive Computing tries to replicate how humans would solve problems while AI seeks to create new ways to solve problems that can potentially be better than humans.

AI is not intended to mimic human thoughts and processes but to solve a problem through the best possible algorithm.

Cognitive Computing is not responsible for making decisions for humans; it simply supplements the information humans use to make their own decisions.


AI is responsible for making decisions on its own, minimizing the role of humans.














THIS POST IS NOW CONTINUED TO PART 7 BELOW--





CAPT AJIT VADAKAYIL
..

