The Digital Giants Stir…

On the day local Facebook news-feeds went dark, we look at the issues behind the ban, and consider the recent data on the rise of digital media. Has the Government backed local media barons against the future? And what are the potential implications?

Sure, local ad funding for local media has fallen, but we think the genie is out of the bottle, and with good reason.

Go to the Walk The World Universe at https://walktheworld.com.au/

Facebook Transcribed Users’ Audio Chats

Via Bloomberg.

Facebook Inc. has been paying hundreds of outside contractors to transcribe clips of audio from users of its services, according to people with knowledge of the work.

The work has rattled the contract employees, who are not told where the audio was recorded or how it was obtained — only to transcribe it, said the people, who requested anonymity for fear of losing their jobs. They’re hearing Facebook users’ conversations, sometimes with vulgar content, but do not know why Facebook needs them transcribed, the people said.

Facebook confirmed that it had been transcribing users’ audio and said it will no longer do so, following scrutiny of other companies. “Much like Apple and Google, we paused human review of audio more than a week ago,” the company said Tuesday. The company said the users who were affected chose the option in Facebook’s Messenger app to have their voice chats transcribed. The contractors were checking whether Facebook’s artificial intelligence correctly interpreted the messages, which were anonymized.

The social networking giant, which just completed a $5 billion settlement with the U.S. Federal Trade Commission after a probe of its privacy practices, has long denied that it collects audio from users to inform ads or help determine what people see in their news feeds. Chief Executive Officer Mark Zuckerberg denied the idea directly in Congressional testimony.

Facebook’s Libra Could Be Revolutionary

Facebook released its white paper, and it poses a threat to current payment systems. The 29-page paper describes a protocol designed to evolve as it powers a new global currency.

“The Libra Blockchain is a decentralized, programmable database designed to support a low-volatility cryptocurrency that will have the ability to serve as an efficient medium of exchange for billions of people around the world.”

As Libra is a stablecoin, it will have less volatility than a crypto like Bitcoin, as it’s tied to the value of real-world currencies. Its potential is huge.

All over the world, people with less money pay more for financial services. Hard-earned income is eroded by fees, from remittances and wire costs to overdraft and ATM charges. Payday loans can charge annualized interest rates of 400 percent or more, and finance charges can be as high as $30 just to borrow $100. When people are asked why they remain on the fringe of the existing financial system, those who remain “unbanked” point to not having sufficient funds, high and unpredictable fees, banks being too far away, and lacking the necessary documentation.

Behind it is the Libra Association, an independent, not-for-profit membership organization based in Geneva, Switzerland. “Members of the Libra Association will consist of geographically distributed and diverse businesses, nonprofit and multilateral organizations, and academic institutions.”

Founding members include:

  • Payments: Mastercard, PayPal, PayU (Naspers’ fintech arm), Stripe, Visa
  • Technology and marketplaces: Booking Holdings, eBay, Facebook/Calibra, Farfetch, Lyft, MercadoPago, Spotify AB, Uber Technologies, Inc.
  • Telecommunications: Iliad, Vodafone Group
  • Blockchain: Anchorage, Bison Trails, Coinbase, Inc., Xapo Holdings Limited
  • Venture Capital: Andreessen Horowitz, Breakthrough Initiatives, Ribbit Capital, Thrive Capital, Union Square Ventures
  • Nonprofit and multilateral organizations, and academic institutions: Creative Destruction Lab, Kiva, Mercy Corps, Women’s World Banking

The reaction from the fintech industry has been positive, though others are concerned about privacy and the risk posed by Facebook’s existing reach, and are calling for an inquiry into the proposal before it proceeds. Whichever way you look at it – this is big news.

The fintech industry has expressed excitement over the announcement by Facebook that it was set to introduce a new cryptocurrency to market with the help of some of the biggest names in tech, via InvestorDaily.

The cryptocurrency, dubbed Libra, was announced in a Facebook white paper stating its mission to empower billions worldwide to enter the financial market.

“The mission for Libra is a simple global currency and financial infrastructure that empowers billions of people,” said the white paper. 

The move has been met with excitement by industry players, and general manager of FinTech Australia Rebecca Schot-Guppy said such a rollout would open up new markets and promote fintech innovation. 

“Another exciting prospect out of this is that Facebook’s reach may also help finally educate the public on the power of blockchain and cryptocurrency. Calibra [digital wallet for Libra] could take these technologies mainstream and put them at the fingertips of every Australian,” she said. 

Co-founder and co-chief executive of Assembly Payments Simon Lee said it seemed that Facebook was attempting to copy what WeChat and Alipay had done in China. 

“We see Calibra as Facebook’s attempt to roll out what is happening in China to the rest of the world. They’ve seen the opportunity and have the scale to execute on it,” he said. 

The currency will be built on the Libra blockchain and backed by a reserve of assets designed to give it intrinsic value, but perhaps the biggest nod to consumers is that it will be governed by an independent association. 

Facebook has been plagued with user privacy controversies, which would lead many consumers to be sceptical about integrating the social media platform with their financial lives. 

However, the Libra Association is their attempt to placate those voices by establishing it as a governing entity that is made up of the likes of Visa, Mastercard, Uber, eBay, Spotify and Vodafone. 

The association, according to the white paper, will facilitate the operation of the blockchain and manage the Libra reserve, making it the only party able to mint and burn coins. 
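The mint-and-burn arrangement described above can be sketched in a few lines. This is an illustrative toy only, not the Libra protocol: the class and method names are invented, and a fixed coin price stands in for the basket-derived value.

```python
# Toy sketch of a reserve-backed mint/burn mechanism: coins are only
# created when matching fiat enters the reserve, and only destroyed
# when fiat is redeemed out of it. Names and numbers are invented.

class ReserveBackedCoin:
    def __init__(self):
        self.reserve_usd = 0.0
        self.coins_outstanding = 0.0

    def mint(self, usd_in, coin_price_usd):
        """Authorised purchase: fiat enters the reserve, coins are created."""
        self.reserve_usd += usd_in
        minted = usd_in / coin_price_usd
        self.coins_outstanding += minted
        return minted

    def burn(self, coins_in, coin_price_usd):
        """Redemption: coins are destroyed, fiat leaves the reserve."""
        self.coins_outstanding -= coins_in
        usd_out = coins_in * coin_price_usd
        self.reserve_usd -= usd_out
        return usd_out


bank = ReserveBackedCoin()
bank.mint(1000.0, coin_price_usd=1.0)   # 1000 coins created against $1,000
bank.burn(250.0, coin_price_usd=1.0)    # 250 coins destroyed, $250 redeemed
print(bank.coins_outstanding, bank.reserve_usd)  # 750.0 750.0
```

The point of restricting mint and burn to one governed entity is that coins in circulation always stay matched, in value, by assets held in the reserve.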

The association notes in the white paper that it is important to move towards increasing decentralisation to ensure that there remains a low barrier to entry for the network. 

The chief executive of neobank Maslow, Kane Jackson, said the association’s concept showed that Facebook was aware of what was required in order for the coin to thrive. 

“Facebook seems to understand that widespread adoption of finance-based products will not be achieved without the decentralisation of their governance and a community-inclusive approach to managing them,” he said. 

Facebook has also launched a subsidiary company called Calibra that will handle its crypto dealings in an effort to protect user privacy, meaning Libra payments will not be intermingled with Facebook data. 

Despite Calibra operating as its own app, the wallet will integrate directly into WhatsApp and Facebook Messenger to utilise the vast network of Facebook to promote cryptocurrency. 

It is this network promotion that excites Jasper Lawler, head of research at London Capital Group, who said the network would open other cryptocurrencies to billions. 

“Libra will breed familiarity of cryptos to a much wider audience. Two billion people will now be much more open to Bitcoin and other altcoins,” he said. 

As Libra is a stablecoin, it will have less volatility than a crypto like Bitcoin as it’s tied to the value of real-world currencies, Mr Lawler said. 

“The different properties of a stablecoin complement rather than compete with cryptocurrencies like Bitcoin, Ethereum and Ripple. Being pegged to regular currencies makes stablecoins less volatile and more suited to payment processing,” he said. 
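The pegging idea behind those comments can be made concrete with a toy calculation: a reserve-backed coin's unit value tracks a weighted basket of real-world currencies. The basket weights and exchange rates below are invented for illustration; the white paper does not publish a fixed basket composition.

```python
# Illustrative only: how a basket peg damps volatility. One coin is worth
# the weighted sum of its currency components, so its USD value moves only
# as FX rates move, not with crypto-market sentiment.

BASKET_WEIGHTS = {"USD": 0.50, "EUR": 0.30, "JPY": 0.20}  # hypothetical weights
USD_RATES = {"USD": 1.00, "EUR": 1.12, "JPY": 0.0090}     # hypothetical USD per unit

def coin_value_usd(weights, usd_rates):
    """Value of one coin in USD: weighted sum of its currency components."""
    return sum(w * usd_rates[ccy] for ccy, w in weights.items())

value = coin_value_usd(BASKET_WEIGHTS, USD_RATES)
print(f"1 coin = ${value:.4f}")
```

Because the basket spreads exposure across several currencies, a swing in any one exchange rate moves the coin's value only in proportion to that currency's weight.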

The announcement saw an overnight rally for Facebook, but the community will have to wait to see how the rollout of the coin goes as no launch date has been set.

Digital Platforms Inquiry Raises Significant Issues

The ACCC has released its preliminary report into Digital Platforms. A final report will be produced next year.

The issues raised are significant and far-reaching, and question the substantial market power of players such as Google and Facebook, the data they capture and monetise, and their impact on the media. 94 per cent of online searches in Australia are currently performed through Google.

Facebook and Instagram together obtain approximately 46 per cent of Australian display advertising revenue. No other website or application has a market share of more than five per cent.

They say there is a lack of transparency in the operation of Google and Facebook’s key algorithms, and the other factors influencing the display of results on Google’s search engine results page, and the surfacing of content on Facebook’s News feed.

Anti-competitive discrimination by digital platforms in favour of a related business has been found to exist in overseas cases. For example, in the European Commission’s 2017 decision, Google was found to have systematically given prominent placement to its own comparison shopping service (Google Shopping) and to have demoted rival comparison shopping services in its search results.

Monopoly or near monopoly businesses are often subject to specific regulation due to the risks of competitive harm. The risk of competitive harm increases when the monopoly business is vertically integrated. The ACCC considers that Google and Facebook each have substantial market power and each have activities across the digital advertising supply chain. Google in particular occupies a near monopoly position in online search and online search advertising, and has multiple related businesses offering advertising services.

This is their executive summary:

On 4 December 2017, the then Treasurer, the Hon Scott Morrison MP, directed the Australian Competition and Consumer Commission (the ACCC) to hold an inquiry into the impact of online search engines, social media and digital content aggregators (digital platforms) on competition in the media and advertising services markets. The ACCC was directed to look at the implications of these impacts for media content creators, advertisers and consumers and, in particular, to consider the impact on news and journalistic content.

Digital platforms offer innovative and popular services to consumers that have, in many cases, revolutionised the way consumers communicate with each other, access news and information and interact with business. Many of the services offered by digital platforms provide significant benefits to both consumers and business, as demonstrated by their widespread and frequent use by many Australians and many Australian businesses.

The ACCC considers, however, that we are at a critical point in considering the impact of digital platforms on society. While the ACCC recognises their significant benefits to consumers and businesses, there are important questions to be asked about the role the global digital platforms play in the supply of news and journalism in Australia, what responsibility they should hold as gateways to information and business, and the extent to which they should be accountable for their influence. In particular, this report identifies concerns with the ability and incentive of key digital platforms to favour their own business interests, through their market power and presence across multiple markets, the digital platforms’ impact on the ability of content creators to monetise their content, and the lack of transparency in digital platforms’ operations for advertisers, media businesses and consumers.

Consumers’ awareness and understanding of the extensive amount of information about them collected by digital platforms, and their concerns regarding the privacy of their data, are also critical issues. There are also issues with the role of digital platforms in determining what news and information is accessed by Australians, how this information is provided, and its range and reliability.

Digital platforms are having a profound impact on Australian news media and advertising. The impact of digital platforms on the supply of news and journalism is particularly significant. News and journalism generate broad benefits for society through the production and dissemination of knowledge, the exposure of corruption, and holding governments and other decision makers to account.

It is important that governments and the public are aware of, and understand, the implications of the operation of these digital platforms, their business models and their market power.

The ACCC’s research and analysis to date has provided a valuable understanding of the markets that are the subject of this Inquiry, including information that has not previously been available, and has identified a number of issues that could, or should, be addressed. Many of these issues are complex.

The ACCC has decided that the best way to address these issues in the final report, due 3 June 2019, is to identify preliminary recommendations and areas for further analysis, and to engage with stakeholders on these potential proposals. Such engagement may result in considerable change from the ACCC’s current views, as expressed in this report.


Shadow profiles – Facebook knows about you, even if you’re not on Facebook

From The Conversation.

Facebook’s founder and chief executive Mark Zuckerberg faced two days of grilling before US politicians this week, following concerns over how his company deals with people’s data.

But the data Facebook has on people who are not signed up to the social media giant also came under scrutiny.

During Zuckerberg’s congressional testimony he claimed to be ignorant of what are known as “shadow profiles”.

Zuckerberg: I’m not — I’m not familiar with that.

That’s alarming, given that we have been discussing this element of Facebook’s non-user data collection for the past five years, ever since the practice was brought to light by researchers at Packet Storm Security.

Maybe it was just the phrase “shadow profiles” with which Zuckerberg was unfamiliar. It wasn’t clear, but others were not impressed by his answer.

Facebook’s proactive data-collection processes have been under scrutiny in previous years, especially as researchers and journalists have delved into the workings of Facebook’s “Download Your Information” and “People You May Know” tools to report on shadow profiles.

Shadow profiles

To explain shadow profiles simply, let’s imagine a social group of three people – Ashley, Blair and Carmen – who already know one another, and have each other’s email addresses and phone numbers in their phones.

If Ashley joins Facebook and uploads her phone contacts to Facebook’s servers, then Facebook can proactively suggest friends whom she might know, based on the information she uploaded.

For now, let’s imagine that Ashley is the first of her friends to join Facebook. The information she uploaded is used to create shadow profiles for both Blair and Carmen — so that if Blair or Carmen joins, they will be recommended Ashley as a friend.

Next, Blair joins Facebook, uploading his phone’s contacts too. Thanks to the shadow profile, he has a ready-made connection to Ashley in Facebook’s “People You May Know” feature.

At the same time, Facebook has learned more about Carmen’s social circle — in spite of the fact that Carmen has never used Facebook, and therefore has never agreed to its policies for data collection.
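The mechanics of the Ashley, Blair and Carmen example can be sketched as a toy contact graph. This is purely an illustration of the concept described above, not Facebook's actual implementation; the class and method names are invented.

```python
# Toy model of how contact uploads can seed "shadow profiles": the service
# accumulates knowledge about people who have never joined, simply because
# members uploaded their details.

class ContactGraph:
    def __init__(self):
        self.members = set()       # people who have joined the service
        self.known_contacts = {}   # person -> set of members who uploaded them

    def join(self, name, uploaded_contacts):
        """A new member joins and uploads their phone contacts."""
        self.members.add(name)
        for contact in uploaded_contacts:
            # A shadow profile is created (or extended) for each contact,
            # whether or not that person has ever joined.
            self.known_contacts.setdefault(contact, set()).add(name)

    def people_you_may_know(self, name):
        """Members who previously uploaded this person's details."""
        return self.known_contacts.get(name, set()) & self.members


graph = ContactGraph()
graph.join("Ashley", ["Blair", "Carmen"])  # shadow profiles for Blair, Carmen
graph.join("Blair", ["Ashley", "Carmen"])

print(graph.people_you_may_know("Blair"))  # {'Ashley'}
print(graph.known_contacts["Carmen"])      # {'Ashley', 'Blair'}
```

Note the last line: Carmen has never joined and never agreed to any data policy, yet the service already holds a record linking her to two members.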

Despite the scary-sounding name, I don’t think there is necessarily any malice or ill will in Facebook’s creation and use of shadow profiles.

It seems like an earnestly designed feature in service of Facebook’s goal of connecting people. It’s a goal that clearly also aligns with Facebook’s financial incentives for growth and garnering advertising attention.

But the practice brings to light some thorny issues around consent, data collection, and personally identifiable information.

What data?

Some of the questions Zuckerberg faced this week highlighted issues relating to the data that Facebook collects from users, and the consent and permissions that users give (or are unaware they give).

Facebook is often quite deliberate in its characterisations of “your data”, rejecting the notion that it “owns” user data.

That said, there are a lot of data on Facebook, and what exactly is “yours” or just simply “data related to you” isn’t always clear. “Your data” notionally includes your posts, photos, videos, comments, content, and so on. It’s anything that could be considered as copyright-able work or intellectual property (IP).

What’s less clear is the state of your rights relating to data that is “about you”, rather than supplied by you. This is data that is created by your presence or your social proximity to Facebook.

Examples of data “about you” might include your browsing history and data gleaned from cookies, tracking pixels, and the like button widget, as well as social graph data supplied whenever Facebook users supply the platform with access to their phone or email contact lists.

Like most internet platforms, Facebook rejects any claim to ownership of the IP that users post. To avoid falling foul of copyright issues in the provision of its services, Facebook demands (as part of its user agreements and Statement of Rights and Responsibilities) a:

…non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use any IP content that you post on or in connection with Facebook (IP License). This IP License ends when you delete your IP content or your account unless your content has been shared with others, and they have not deleted it.

Data scares

If you’re on Facebook then you’ve probably seen a post that keeps making the rounds every few years, saying:

In response to the new Facebook guidelines I hereby declare that my copyright is attached to all of my personal details…

Part of the reason we keep seeing data scares like this is that Facebook’s lacklustre messaging around user rights and data policies has contributed to confusion, uncertainty and doubt among its users.

It was a point that Republican Senator John Kennedy raised with Zuckerberg this week.

Senator John Kennedy’s exclamation is a strong, but fair assessment of the failings of Facebook’s policy messaging.

After the grilling

Zuckerberg and Facebook should learn from this congressional grilling that they have struggled and occasionally failed in their responsibilities to users.

It’s important that Facebook now makes efforts to communicate more strongly with users about their rights and responsibilities on the platform, as well as the responsibilities that Facebook owes them.

This should go beyond a mere awareness-style PR campaign. It should seek to truly inform and educate Facebook’s users, and people who are not on Facebook, about their data, their rights, and how they can meaningfully safeguard their personal data and privacy.

Given the magnitude of Facebook as an internet platform, and its importance to users across the world, the spectre of regulation will continue to raise its head.

Ideally, the company should look to broaden its governance horizons, by seeking to truly engage in consultation and reform with Facebook’s stakeholders – its users — as well as the civil society groups and regulatory bodies that seek to empower users in these spaces.

Author : Andrew Quodling PhD candidate researching governance of social media platforms, Queensland University of Technology

How you helped create the crisis in private data

From The Conversation.

As Facebook’s Mark Zuckerberg testifies before Congress, he’s likely wondering how his company got to the point where he must submit to public questioning. It’s worth pondering how we, the Facebook-using public, got here too.

The scandal in which Cambridge Analytica harvested data from millions of Facebook users to craft and target advertising for Donald Trump’s presidential campaign has provoked broad outrage. More helpfully, it has exposed the powerful yet perilous role of data in U.S. society.

Repugnant as its methods were, Cambridge Analytica did not create this crisis on its own. As I argue in my forthcoming book, “The Known Citizen: A History of Privacy in Modern America,” big corporations (in this case, Facebook) and political interests (in this case, right-wing parties and campaigns) but also ordinary Americans (social media users, and thus likely you and me) all had a hand in it.

The allure of aggregate data

Businesses and governments have led the way. As long ago as the 1840s, credit-lending firms understood the profits to be made from customers’ financial reputations. These precursors of Equifax, Experian and TransUnion eventually became enormous clearinghouses of personal data.

For its part, the federal government, from the earliest census in 1790 to the creation of New Deal social welfare programs, has long relied on aggregate as well as individual data to distribute resources and administer benefits. For example, a person’s individual Social Security payments depend in part on changes in the overall cost of living across the country.

Police forces and national security analysts, too, gathered fingerprints and other data in the name of social control. Today, they employ some of the same methods as commercial data miners to profile criminals or terrorists, crafting ever-tighter nets of detection. State-of-the-art public safety tools include access to social media accounts, online photographs, geolocation information and cell tower data.

Probing the personal

The search for better data in the 20th century often meant delving into individuals’ most personal, intimate lives. To that end, marketers, strategists and behavioral researchers conducted increasingly sophisticated surveys, polls and focus groups. They identified effective ways to reach specific customers and voters – and often, to influence their behaviors.

In the middle of the last century, for example, motivational researchers sought psychological knowledge about consumers in the hopes of subconsciously influencing them through subliminal advertising. Those probes into consumers’ personalities and desires foreshadowed Cambridge Analytica’s pitch to commercial and political clients – using data, as its website proudly proclaims, “to change audience behavior.”

Citizens were not just unwitting victims of these schemes. People have regularly, and willingly, revealed details about themselves in the name of security, convenience, health, social connection and self-knowledge. Despite rising public concerns about privacy and data insecurity, large numbers of Americans still find benefits in releasing their data to government and commercial enterprises, whether through E-ZPasses, Fitbits or Instagram posts.

Revealing ourselves

It is perhaps particularly appropriate that the Facebook scandal bloomed from a personality test app, “This is your digital life.” For decades, human relations departments and popular magazines have urged Americans to yield private details, and harness the power of aggregate data, to better understand themselves. But in most situations, people weren’t consciously trading privacy for that knowledge.

In the linked and data-hungry internet age, however, those volunteered pieces of information take on lives of their own. Individual responses from 270,000 people on this particular test became a gateway to more data, including that belonging to another 87 million of their friends.

Today, data mining corporations, political operatives and others seek data everywhere, hoping to turn that information to their own advantage. As Cambridge Analytica’s actions revealed, those groups will use data for startling purposes – such as targeting very specific groups of voters with highly customized messages – even if it means violating the policies and professed intentions of one of the most powerful corporations on the planet.

The benefits of aggregate data help explain why it has been so difficult to enact rigorous privacy laws in the U.S. As government and corporate data-gathering efforts swelled over the last century, citizens largely accepted, without much discussion or protest, that their society would be fueled by the collection of personal information. In this sense, we have all – regular individuals, government agencies and corporations like Facebook – collaborated to create the present crisis around private data.

But as Zuckerberg’s summons to Washington suggests, people are beginning to grasp that Facebook’s enormous profits exploit the value of their information and come at the price of their privacy. By making the risks of this arrangement clear, Cambridge Analytica may have done some good after all.

Author: Sarah Igo, Associate Professor of History; Associate Professor of Political Science; Associate Professor of Sociology; Associate Professor of Law, Vanderbilt University

It’s time for third-party data brokers to emerge from the shadows

From The Conversation.

Facebook announced last week it would discontinue the partner programs that allow advertisers to use third-party data from companies such as Acxiom, Experian and Quantium to target users.

Graham Mudd, Facebook’s product marketing director, said in a statement:

We want to let advertisers know that we will be shutting down Partner Categories. This product enables third party data providers to offer their targeting directly on Facebook. While this is common industry practice, we believe this step, winding down over the next six months, will help improve people’s privacy on Facebook.

Few people seemed to notice, and that’s hardly surprising. These data brokers operate largely in the background.

The invisible industry worth billions

In 2014, one researcher described the entire industry as “largely invisible”. That’s no mean feat, given how much money is being made. Personal data has been dubbed the “new oil”, and data brokers are very efficient miners. In the 2018 fiscal year, Acxiom expects annual revenue of approximately US$945 million.

The data broker business model involves accumulating information about internet users (and non-users) and then selling it. As such, data brokers have highly detailed profiles on billions of individuals, comprising age, race, sex, weight, height, marital status, education level, politics, shopping habits, health issues, holiday plans, and more.

These profiles come not just from data you’ve shared, but from data shared by others, and from data that’s been inferred. In its 2014 report into the industry, the US Federal Trade Commission (FTC) showed how a single data broker had 3,000 “data segments” for nearly every US consumer.

Based on the interests inferred from this data, consumers are then placed in categories such as “dog owner” or “winter activity enthusiast”. However, some categories are potentially sensitive, including “expectant parent”, “diabetes interest” and “cholesterol focus”, or involve ethnicity, income and age. The FTC’s Jon Leibowitz described data brokers as the “unseen cyberazzi who collect information on all of us”.

In Australia, Facebook launched the Partner Categories program in 2015. Its aim was to “reach people based on what they do and buy offline”. This includes demographic and behavioural data, such as purchase history and home ownership status, which might come from public records, loyalty card programs or surveys. In other words, Partner Categories enables advertisers to use data brokers to reach specific audiences. This is particularly useful for companies that don’t have their own customer databases.

A growing concern

Third party access to personal data is causing increasing concern. This week, Grindr was shown to be revealing its users’ HIV status to third parties. Such news is unsettling, as if there are corporate eavesdroppers on even our most intimate online engagements.

The recent Cambridge Analytica furore stemmed from third parties. Indeed, apps created by third parties have proved particularly problematic for Facebook. From 2007 to 2014, Facebook encouraged external developers to create apps for users to add content, play games, share photos, and so on.

Facebook then gave the app developers wide-ranging access to user data, and to users’ friends’ data. The data shared might include details of schooling, favourite books and movies, or political and religious affiliations.

As one group of privacy researchers noted in 2011, this process, “which nearly invisibly shares not just a user’s, but a user’s friends’ information with third parties, clearly violates standard norms of information flow”.

With the Partner Categories program, the buying, selling and aggregation of user data may be largely hidden, but is it unethical? The fact that Facebook has moved to stop the arrangement suggests that it might be.

More transparency and more respect for users

To date, there has been insufficient transparency, insufficient fairness and insufficient respect for user consent. This applies to Facebook, but also to app developers, and to Acxiom, Experian, Quantium and other data brokers.

Users might have clicked “agree” to terms and conditions that contained a clause ostensibly authorising such sharing of data. However, it’s hard to construe this type of consent as morally justifying.

In Australia, new laws are needed. Data flows in complex and unpredictable ways online, and legislation ought to provide, under threat of significant penalties, that companies (and others) must abide by reasonable principles of fairness and transparency when they deal with personal information. Further, such legislation can help specify what sort of consent is required, and in which contexts. Currently, the Privacy Act doesn’t go far enough, and is too rarely invoked.

In its 2014 report, the US Federal Trade Commission called for laws that enabled consumers to learn about the existence and activities of data brokers. That should be a starting point for Australia too: consumers ought to have reasonable access to information held by these entities.

Time to regulate

Having resisted regulation since 2004, Mark Zuckerberg has finally conceded that Facebook should be regulated – and advocated for laws mandating transparency for online advertising.

Historically, Facebook has made a point of dedicating itself to openness, but Facebook itself has often operated with a distinct lack of openness and transparency. Data brokers have been even worse.

Facebook’s motto used to be “Move fast and break things”. Now Facebook, data brokers and other third parties need to work with lawmakers to move fast and fix things.

Author: Sacha Molitorisz, Postdoctoral Research Fellow, Centre for Media Transition, Faculty of Law, University of Technology Sydney

How Cambridge Analytica’s Facebook targeting model really worked

From The Conversation.

The researcher whose work is at the center of the Facebook-Cambridge Analytica data analysis and political advertising uproar has revealed that his method worked much like the one Netflix uses to recommend movies.

In an email to me, Cambridge University scholar Aleksandr Kogan explained how his statistical model processed Facebook data for Cambridge Analytica. The accuracy he claims suggests it works about as well as established voter-targeting methods based on demographics like race, age and gender.

If confirmed, Kogan’s account would mean the digital modeling Cambridge Analytica used was hardly the virtual crystal ball a few have claimed. Yet the numbers Kogan provides also show what is – and isn’t – actually possible by combining personal data with machine learning for political ends.

Regarding one key public concern, though, Kogan’s numbers suggest that information on users’ personalities or “psychographics” was just a modest part of how the model targeted citizens. It was not a personality model strictly speaking, but rather one that boiled down demographics, social influences, personality and everything else into a big correlated lump. This soak-up-all-the-correlation-and-call-it-personality approach seems to have created a valuable campaign tool, even if the product being sold wasn’t quite as it was billed.

The promise of personality targeting

In the wake of the revelations that Trump campaign consultants Cambridge Analytica used data from 50 million Facebook users to target digital political advertising during the 2016 U.S. presidential election, Facebook has lost billions in stock market value, governments on both sides of the Atlantic have opened investigations, and a nascent social movement is calling on users to #DeleteFacebook.

But a key question has remained unanswered: Was Cambridge Analytica really able to effectively target campaign messages to citizens based on their personality characteristics – or even their “inner demons,” as a company whistleblower alleged?

If anyone would know what Cambridge Analytica did with its massive trove of Facebook data, it would be Aleksandr Kogan and Joseph Chancellor. It was their startup Global Science Research that collected profile information from 270,000 Facebook users and tens of millions of their friends using a personality test app called “thisisyourdigitallife.”

Part of my own research focuses on understanding machine learning methods, and my forthcoming book discusses how digital firms use recommendation models to build audiences. I had a hunch about how Kogan and Chancellor’s model worked.

So I emailed Kogan to ask. Kogan is still a researcher at Cambridge University; his collaborator Chancellor now works at Facebook. In a remarkable display of academic courtesy, Kogan answered.

His response requires some unpacking, and some background.

From the Netflix Prize to “psychometrics”

Back in 2006, when it was still a DVD-by-mail company, Netflix offered a reward of $1 million to anyone who developed a better way to make predictions about users’ movie ratings than the company already had. A surprise top competitor was an independent software developer using the pseudonym Simon Funk, whose basic approach was ultimately incorporated into all the top teams’ entries. Funk adapted a technique called “singular value decomposition,” condensing users’ ratings of movies into a series of factors or components – essentially a set of inferred categories, ranked by importance. As Funk explained in a blog post,

“So, for instance, a category might represent action movies, with movies with a lot of action at the top, and slow movies at the bottom, and correspondingly users who like action movies at the top, and those who prefer slow movies at the bottom.”

Factors are artificial categories, which are not always like the kind of categories humans would come up with. The most important factor in Funk’s early Netflix model was defined by users who loved films like “Pearl Harbor” and “The Wedding Planner” while also hating movies like “Lost in Translation” or “Eternal Sunshine of the Spotless Mind.” His model showed how machine learning can find correlations among groups of people, and groups of movies, that humans themselves would never spot.

Funk’s general approach used the 50 or 100 most important factors for both users and movies to make a decent guess at how every user would rate every movie. This method, often called dimensionality reduction or matrix factorization, was not new. Political science researchers had shown that similar techniques using roll-call vote data could predict the votes of members of Congress with 90 percent accuracy. In psychology the “Big Five” model had also been used to predict behavior by clustering together personality questions that tended to be answered similarly.

Still, Funk’s model was a big advance: It allowed the technique to work well with huge data sets, even those with lots of missing data – like the Netflix dataset, where a typical user rated only a few dozen films out of the thousands in the company’s library. More than a decade after the Netflix Prize contest ended, SVD-based methods, or related models for implicit data, are still the tool of choice for many websites to predict what users will read, watch, or buy.
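Funk’s approach can be sketched in a few lines. The toy below is illustrative only – made-up ratings and hypothetical parameter choices, not the Netflix Prize code. It learns user and movie factors by gradient descent over the observed entries alone, which is exactly what lets the method cope with missing data:

```python
import numpy as np

# Toy ratings matrix: 4 users x 5 movies, 0 marks a missing rating.
R = np.array([
    [5, 4, 0, 1, 0],
    [4, 0, 0, 1, 1],
    [1, 1, 0, 5, 4],
    [0, 1, 5, 4, 0],
], dtype=float)

n_users, n_items = R.shape
k = 2                      # number of latent factors ("categories")
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(n_users, k))   # user factors
V = rng.normal(scale=0.1, size=(n_items, k))   # movie factors

lr, reg = 0.02, 0.02       # learning rate and regularization (hypothetical)
observed = [(u, i) for u in range(n_users)
            for i in range(n_items) if R[u, i] > 0]

# Stochastic gradient descent over observed entries only.
for epoch in range(2000):
    for u, i in observed:
        Uu = U[u].copy()
        err = R[u, i] - Uu @ V[i]
        U[u] += lr * (err * V[i] - reg * Uu)
        V[i] += lr * (err * Uu - reg * V[i])

# The model now predicts a rating for every user-movie pair,
# including the pairs that were never observed.
pred = U @ V.T
rmse = np.sqrt(np.mean([(R[u, i] - pred[u, i]) ** 2 for u, i in observed]))
print(f"RMSE on observed ratings: {rmse:.3f}")
```

The key design point is that the loss is summed only over ratings that exist, so the factorization never has to invent values for the holes in the matrix.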

These models can predict other things, too.

Facebook knows if you are a Republican

In 2013, Cambridge University researchers Michal Kosinski, David Stillwell and Thore Graepel published an article on the predictive power of Facebook data, using information gathered through an online personality test. Their initial analysis was nearly identical to that used on the Netflix Prize, using SVD to categorize both users and things they “liked” into the top 100 factors.

The paper showed that a factor model made with users’ Facebook “likes” alone was 95 percent accurate at distinguishing between black and white respondents, 93 percent accurate at distinguishing men from women, and 88 percent accurate at distinguishing people who identified as gay men from men who identified as straight. It could even correctly distinguish Republicans from Democrats 85 percent of the time. It was also useful, though not as accurate, for predicting users’ scores on the “Big Five” personality test.
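To get a feel for how a likes-based factor model can separate groups this accurately, here is a sketch using entirely synthetic data and plain SVD – not the authors’ dataset, model or code. The group structure is planted by construction, so the resulting accuracy is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(42)
n_per_group, n_likes = 50, 20

# Synthetic binary user-by-"like" matrix: group 0 mostly likes the
# first half of the pages, group 1 mostly the second half.
likes = np.zeros((2 * n_per_group, n_likes))
likes[:n_per_group, :n_likes // 2] = rng.random((n_per_group, n_likes // 2)) < 0.6
likes[n_per_group:, n_likes // 2:] = rng.random((n_per_group, n_likes // 2)) < 0.6
likes += (rng.random(likes.shape) < 0.05)   # a little noise
likes = np.clip(likes, 0, 1)
labels = np.repeat([0, 1], n_per_group)

# Center and factorize; each user is summarized by top factor scores.
X = likes - likes.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = U[:, :2] * s[:2]

# Classify users by which side of zero the first factor falls on.
pred = (scores[:, 0] > 0).astype(int)
acc = max(np.mean(pred == labels), np.mean(pred != labels))  # sign is arbitrary
print(f"Accuracy from a single SVD factor: {acc:.0%}")
```

The first factor soaks up the dominant source of variation – here, group membership – without ever being told the groups exist, which is the essence of how a “likes” model can recover demographics.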

There was public outcry in response; within weeks Facebook had made users’ likes private by default.

Kogan and Chancellor, also Cambridge University researchers at the time, were starting to use Facebook data for election targeting as part of a collaboration with Cambridge Analytica’s parent firm SCL. Kogan invited Kosinski and Stillwell to join his project, but it didn’t work out. Kosinski reportedly suspected Kogan and Chancellor might have reverse-engineered the Facebook “likes” model for Cambridge Analytica. Kogan denied this, saying his project “built all our models using our own data, collected using our own software.”

What did Kogan and Chancellor actually do?

As I followed the developments in the story, it became clear Kogan and Chancellor had indeed collected plenty of their own data through the thisisyourdigitallife app. They certainly could have built a predictive SVD model like that featured in Kosinski and Stillwell’s published research.

So I emailed Kogan to ask if that was what he had done. Somewhat to my surprise, he wrote back.

“We didn’t exactly use SVD,” he wrote, noting that SVD can struggle when some users have many more “likes” than others. Instead, Kogan explained, “The technique was something we actually developed ourselves … It’s not something that is in the public domain.” Without going into details, Kogan described their method as “a multi-step co-occurrence approach.”

However, his message went on to confirm that his approach was indeed similar to SVD or other matrix factorization methods, like those in the Netflix Prize competition and the Kosinski-Stillwell-Graepel Facebook model. Dimensionality reduction of Facebook data was the core of his model.

How accurate was it?

Kogan suggested the exact model used doesn’t matter much, though – what matters is the accuracy of its predictions. According to Kogan, the “correlation between predicted and actual scores … was around [30 percent] for all the personality dimensions.” By comparison, a person’s previous Big Five scores are about 70 to 80 percent accurate in predicting their scores when they retake the test.
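A small simulation helps show what a correlation of roughly 0.3 means in practice. The scores below are synthetic, not Kogan’s data; the construction mixes a true signal with noise so that the expected correlation is 0.3:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
true_scores = rng.standard_normal(n)    # simulated "actual" Big Five scores

# A prediction built from 30% signal and the rest noise has
# an expected correlation of 0.30 with the truth.
w = 0.3
predicted = w * true_scores + np.sqrt(1 - w**2) * rng.standard_normal(n)

r = np.corrcoef(true_scores, predicted)[0, 1]
print(f"correlation: {r:.2f}")             # close to 0.30
print(f"variance explained: {r**2:.1%}")   # roughly 9%
```

The squared correlation is the share of variance explained, so a 0.3 correlation means the model accounts for only about 9 percent of the variation in a personality dimension – useful in aggregate across millions of users, but far from a crystal ball for any individual.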

Kogan’s accuracy claims cannot be independently verified, of course. And anyone in the midst of such a high-profile scandal might have an incentive to understate his or her contribution. In his appearance on CNN, Kogan explained to an increasingly incredulous Anderson Cooper that, in fact, the models had actually not worked very well.

In fact, the accuracy Kogan claims seems a bit low, but plausible. Kosinski, Stillwell and Graepel reported comparable or slightly better results, as have several other academic studies using digital footprints to predict personality (though some of those studies had more data than just Facebook “likes”). It is surprising that Kogan and Chancellor would go to the trouble of designing their own proprietary model if off-the-shelf solutions seem to be just as accurate.

Importantly, though, the model’s accuracy on personality scores allows comparisons of Kogan’s results with other research. Published models with equivalent accuracy in predicting personality are all much more accurate at guessing demographics and political variables.

For instance, the similar Kosinski-Stillwell-Graepel SVD model was 85 percent accurate in guessing party affiliation, even without using any profile information other than likes. Kogan’s model had similar or better accuracy. Adding even a small amount of information about friends or users’ demographics would likely boost this accuracy above 90 percent. Guesses about gender, race, sexual orientation and other characteristics would probably be more than 90 percent accurate too.

Critically, these guesses would be especially good for the most active Facebook users – the people the model was primarily used to target. Users with less activity to analyze are likely not on Facebook much anyway.

When psychographics is mostly demographics

Knowing how the model is built helps explain Cambridge Analytica’s apparently contradictory statements about the role – or lack thereof – that personality profiling and psychographics played in its modeling. They’re all technically consistent with what Kogan describes.

A model like Kogan’s would give estimates for every variable available on any group of users. That means it would automatically estimate the Big Five personality scores for every voter. But these personality scores are the output of the model, not the input. All the model knows is that certain Facebook likes, and certain users, tend to be grouped together.

With this model, Cambridge Analytica could say that it was identifying people with low openness to experience and high neuroticism. But the same model, with the exact same predictions for every user, could just as accurately claim to be identifying less educated older Republican men.

Kogan’s information also helps clarify the confusion about whether Cambridge Analytica actually deleted its trove of Facebook data, when models built from the data seem to still be circulating, and even being developed further.

The whole point of a dimension reduction model is to mathematically represent the data in simpler form. It’s as if Cambridge Analytica took a very high-resolution photograph, resized it to be smaller, and then deleted the original. The photo still exists – and as long as Cambridge Analytica’s models exist, the data effectively does too.
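The photograph analogy can be made concrete: a truncated SVD keeps most of a matrix’s structure in a much smaller set of numbers, after which the original can be discarded. The sketch below uses random toy data with a planted low-rank structure:

```python
import numpy as np

rng = np.random.default_rng(1)

# A 100x100 matrix that is secretly rank 3, plus a little noise -
# a stand-in for a big dataset with a few underlying factors.
A = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 100))
A += 0.1 * rng.standard_normal((100, 100))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 3
A_k = U[:, :k] * s[:k] @ Vt[:k]   # the rank-3 "resized photograph"

# The compressed copy stores k*(100+100+1) numbers instead of 100*100,
# yet reproduces almost all of the original matrix.
retained = 1 - np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(f"fraction of the matrix recovered from the rank-3 copy: {retained:.1%}")
```

Deleting `A` after computing `A_k` is the analogue of Cambridge Analytica deleting the raw Facebook data: almost everything the data could say about users survives in the compressed model.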

Author: Matthew Hindman, Associate Professor of Media and Public Affairs, George Washington University

Why it’s so hard to Delete Facebook: Constant psychological boosts keep you hooked

From The Conversation.

Here we go again: another Facebook controversy, yet again violating our sense of privacy by letting others harvest our personal information. This flareup is a big one to be sure, leading some people to consider leaving Facebook altogether, but the company and most of its over 2 billion users will reconcile. The vast majority will return to Facebook, just like they did the last time and the many times before that. As in all abusive relationships, users have a psychological dependence that keeps them hooked despite knowing that, at some level, it’s not good for them.

Decades of research has shown that our relationship with all media, whether movies, television or radio, is symbiotic: People like them because of the gratifications they get from consuming them – benefits like escapism, relaxation and companionship. The more people use them, the more gratifications they seek and obtain.

With online media, however, a consumer’s use provides data to media companies so they can serve up exactly what would gratify her most, as they mine her behavior patterns to tailor her online experiences and appeal to her individual psychological needs.

Aside from providing content for our consumption, Facebook, Twitter, Google – indeed all interactive media – provide us with new possibilities for interaction on the platform that can satisfy some of our innate human cravings.

Interactive tools in Facebook provide simplified ways to engage your curiosity, broadcast your thoughts, promote your image, maintain relationships and fulfill the yearning for external validation. Social media take advantage of common psychological traits and tendencies to keep you clicking – and revealing more of yourself. Here’s why it’s so hard, as a social network user, to pull the plug once and for all.

Buoying your ‘friend’ships

The more you click, the stronger your online relationships. Hitting the ‘Like’ button, commenting on photos of friends, sending birthday wishes and tagging others are just some of the ways in which Facebook allows you to engage in “social grooming.” All these tiny, fleeting contacts help users maintain relationships with large numbers of people with relative ease.

Molding the image you want to project

The more you reveal, the greater your chances of successful self-presentation. Studies have shown that strategic self-presentation is a key feature of Facebook use. Users shape their online identity by revealing which concert they went to and with whom, which causes they support, which rallies they attend and so on. In this way, you can curate your online self and manage others’ impressions of you, something that would be impossible to do in real life with such regularity and precision. Online, you get to project the ideal version of yourself all the time.

Snooping through an open window

The more you click, the more you can keep an eye on others. This kind of social searching and surveillance is among the most important gratifications obtained from Facebook. Most people take pleasure in looking up others on social media, often surreptitiously. The psychological need to monitor your environment is deep-rooted and drives you to keep up with news of the day – and fall victim to FOMO, the fear of missing out. Even privacy-minded senior citizens, loath to reveal too much about themselves, are known to use Facebook to snoop on others.

Enhancing your social resources

The more you reveal, the greater your social net worth. Being more forthcoming can get you a job via LinkedIn. It can also help an old classmate find you and reconnect. Studies have shown that active use of Facebook can enhance your social capital, whether you’re a college student or a senior citizen wanting to bond with family members or rekindle ties with long-lost friends. Being active on social media is associated with increases in self-esteem and subjective well-being.

Enlarging your tribe

The more you click, the bigger and better the bandwagon. When you click to share a news story on social media or express approval of a product or service, you’re contributing to the creation of a bandwagon of support. Metrics conveying strong bandwagon support, just like five stars for a product on Amazon, are quite persuasive, in part because they represent a consensus among many opinions. In this way, you get to be a part of online communities that form around ideas, events, movements, stories and products – which can ultimately enhance your sense of belonging.

Expressing yourself and being validated

The more you reveal, the greater your agency. Whether it’s a tweet, a status update or a detailed blog post, you get to express yourself and help shape the discourse on social media. This self-expression by itself can be quite empowering. And metrics indicating bandwagon support for your posts – all those “likes” and smiley faces – can profoundly enhance your sense of self worth by appealing to your ingrained psychological need for external validation.

In all these ways, social media’s features provide us with too many important gratifications to forgo easily. If you think most users will give all this up on the off chance that illegally obtained data from their Facebook profiles and activities may be used to influence their votes, think again.

Algorithms that never let you go

While most people may be squeamish about algorithms mining their personal information, there’s an implicit understanding that sharing personal data is a necessary evil that helps enhance their experience. The algorithms that collect your information are also the algorithms that nudge you to be social, based on your interests, behaviors and networks of friends. Without Facebook egging you on, you probably wouldn’t be quite as social. Facebook is a major social lubricant of our time, often recommending friends to add to your circle and notifying you when a friend has said or done something potentially of interest.

A Facebook ‘nudge’ can push you to attend a local event. Facebook screenshot, CC BY-SA

 

Consider how many notifications Facebook sends about events alone. When presented with a nudge about an event, you may at least consider going, probably even visit the event page, maybe indicate that you’re “Interested” and even decide to attend the event. None of these decisions would be possible without first receiving the nudge.

What if Facebook never nudged you? What if algorithms never gave you recommendations or suggestions? Would you still perform those actions? According to nudge theory, you’d be far less likely to take action if you’re not encouraged to do so. If Facebook never nudged you to attend events, add friends, view others’ posts or wish friends Happy Birthday, it’s unlikely you would do it, thereby diminishing your social life and social circles.

Are you willing to say goodbye? Facebook screenshot, CC BY-ND

Facebook knows this very well. Just try deleting your Facebook account and you will be made to realize what a massive repository it is of your private and public memory. When one of us tried deactivating her account, she was told how huge the loss would be – profile disabled, all the memories evaporating, losing touch with over 500 friends. On the top of the page were profile photos of five friends, including the lead author of this article, with the line “S. Shyam will miss you.”

This is like asking if you would like to purposely and permanently cut off ties with all your friends. Now, who would want to do that?

Authors: S. Shyam Sundar, Distinguished Professor of Communication & Co-Director of the Media Effects Research Laboratory, Pennsylvania State University; Bingjie Liu, Ph.D. Student in Mass Communications, Pennsylvania State University; Carlina DiRusso, Ph.D. Student in Mass Communications, Pennsylvania State University; Michael Krieger, Ph.D. Student in Mass Communications, Pennsylvania State University