Please keep in mind that this is a conversation between two "personal" accounts, no business accounts involved. More so, we haven't accepted the new terms of use that "allowed" WhatsApp to access messages between personal accounts and business accounts.
Is WhatsApp scanning personal messages to target their ads as we are noticing? Weren't WhatsApp messages end to end encrypted? Is this a violation of their Terms of Use or am I missing something silly?
Here is how I think you could design a more robust (but less fun) experiment:
- Come up with a bunch of topics, write them down on slips of paper, put the paper into a hat
- Each Monday, draw three topics from the hat, send some WhatsApp messages about the first, Messenger messages about the second, and don’t discuss the third. Don’t put the topics back in the hat.
- If you see any ads relating to one of the topics, screenshot them and save screenshots to eg your computer with a bit of the topic
- Separately, record which topic went to which platform
- After doing this for a while, go through the screenshots and (each of you and your wife or ideally other people) give a rating for how well the ad matches the topic. To avoid bias, you shouldn’t know which app saw the topic.
- Now work out average ratings / the distribution across the three products (WhatsApp vs Messenger vs none) and compare
- pick said topic, something you never cared about before, talk about it but don't write any messages containing it; - for 1 month record every ad you see about it; - send a message about the topic; - for another month, record every ad you see about it
Comparing the number of occurrences will tell you what is happening.
This does not work. How did you come about the topic? Answer: it was in your brain, because advertising, trends among your peers and social connections, online trends real or astroturfed, etc.
That's why you end up with people thinking their phone is "listening" to them.
My wife and I routinely use ad blockers, private browser windows, browser profiles, and try to use as little ad-supported products as possible. This doesn't stop targeted advertising, I guess because most devices we use connect through the same IP. A couple of days after she starts looking up a city we want to travel to, I'll start receiving ads from airline companies or travel agencies, and even tours/cruises to said city/region. Fighting tracking and spyware is nearly a lost cause unless you become a digital Amish.
The best way for small ad tech providers to compete with "big tech" has been to cross lines that the bigger companies won't cross, this is an example for why there are a lot of profitable ad tech companies in the connected TV / video ads space.
Even if you use a VPN, the TV itself likely has a unique ID for ads, so someone just needs to see one request with both the true IP and the unique device ID and then remember that for the future. It's all very shady. TVs are very far behind the level of user control that phones and browsers provide because there's less scrutiny and its more fragmented across manufacturers (all of which want to get in on ad tech).
You can usually find some opt-out of the identifiers if you dig deep enough into the menus, because multiple laws and regulations require them.
But sure enough, it works just fine with no ads, no "free tv channels", and no voice functionality.
We have an LG tv and one of my family members hit accept all after an update and now my remote listens to us. To fix this properly I would need to factory reset which loses all of our streaming settings. I actually don’t because I have a separate ISP only for our TVs so there’s a bit of separation between our streaming use and phone activity
Yes, this retargeting is 'expected' and is not surprising. This is completely different from what OP is describing.
-How did you learn about the product?
-Have you ever searched for it?
-Did a friend of yours tell you about it? Do you think they searched for it?
-Are a lot of ads for it playing on TV channels you like? Could instagram know you like those TV channels?
-Is it something your neighbors got? Do you think there has been a spike in shipments of this product to your neighbors?
Eventually people start to “get” that scanning the text of messages is way more helpful for humans than it is for computers. They’ve got other data they can use.
I'm just not convinced of the always on microphones in phones listening for and processing every single thing considering how much battery drain that would cause, whether the processing is done on device or they're sending all that data to a server to be processed.
We know our phones commonly listen for "smart assistant" prompts and audio beacons (https://www.nanalyze.com/2017/05/audio-beacons-monitor-smart...), so they don't seem to have any trouble abusing the mic access. Honestly without a whistleblower, there's little hope of really understanding how much data a company collects and what they use it for. At least sometimes we can see it in their own marketing materials. For example, https://advertising.roku.com/resources/blog/insights-analysi... tells us:
"Roughly twice per second, a Roku TV captures video “snapshots” in 4K resolution. These snapshots are scanned through a database of content and ads, which allows the exposure to be matched to what is airing. For example, if a streamer is watching an NFL football game and sees an ad for a hard seltzer, Roku’s ACR will know that the ad has appeared on the TV being watched at that time. In this way, the content on screen is automatically recognized, as the technology’s name indicates. The data then is paired with user profile data to link the account watching with the content they’re watching."
None of the people I know who use those devices knew that was happening, but the info was out there at least. When so many people are watching everything you see and do and say who can ever know what every company is doing or what the source of any one ad is?
There were users under the impression that Roku was unaware of the content it was displaying? Like 4K snapshot or not, if I know a user is watching an NFL stream, I know that ad played.
Sure, they expect Roku would know if they launched Disney+ or Netflix, but not that they would knew exactly what movie you were watching or what specific scenes you viewed and for how long. Same with personal videos cast to your screen via roku. It's pretty reasonable they'd know you were streaming content from your other devices, or which apps you were using, but less reasonable that they'd be watching over your shoulder taking notes.
It's sort of like getting mugged once and then setting up a camera in a bunch of alleys to prove that muggers exist. You can even set up a camera of yourself running into dark alleys every night, but the odds of reproducing a mugging is still extremely low.
There's a certain kind of precision that convinces me it's real though. Precision is common. I look at a book on Amazon, and a FB ad for that book appears.
But I get rejected for a loan via WhatsApp and then used car ads appear for that model of car that I applied a loan for? That's a bit on the nose.
Aside from random coincidence, I could see this happening if you provided your personal information (especially email) for the loan application. It could have been shared to multiple underlying lenders alongside a data vendor who ultimately provided interest targeting (which can include car models) to an ad network.
Getting an ad for that specific model could also have been due to other online activity, such as checking the KBB.
I suspect op may have researched the car model and got retargeted: some ad networks keep track of specific products you've shown interest in (not generic interest-areas like Google ads) and track you via a cookie. You may be visiting a completely different site that uses the same network, and get ads on the exact product you've spent minutes reading about.
If it's integrated into such activities, it might actually be a good explanation for all the other similar scenarios blamed on WhatsApp.
No matter how secure the platform, if you apply for a loan through it, the loan provider will “know” you want a loan, and happily sell that data.
I did that and I got a mugging on camera. The attacker was convicted.
If I had to guess, your whatsapp messages are e2e secured but keywords are sent to facebook when they match some condition. So if you message "happy birthday" to someone, they won't see that but the fact that the keyword "birthday" was found even if the word isn't included is sent to fb. That way they can say they're not snooping your messages.
Or maybe some combination of things we did previously led naturally to thinking about that yellow nail polish. I'm thinking about something like the trick where you ask somebody a bunch of addition problems that have 14 as the answer (what's 10+4? 2+12? 3+9? etc...) then ask them to name a vegetable and they will almost always say carrot.
I think it's interesting when a bunch of people chime in and say "Hey, yeah, I had some crazy thing happen to me, I'm in tech and understand how this stuff works, and there's a very small to zero chance this happened through some other parallel construction by the tech company, they just straight up listened to my conversation and showed me an ad".
This is what kicks off a handful of you to go packet sniffing and write up a blog post looking for this behavior. So yes, evidence is welcome but it doesn't seem like we are quite there yet.
To me, that is indicative that, contra the original claim, no such thing has ever been proven.
Is it verboten to say that?
The proof has to do with the technical details, not the authority figure posting it. If someone from the EFF wrote a blog post with the same content as these HN posters, I would be similarly dismissive of this as "proof."
They're saying "facebook has been caught multiple times doing this", which is not a personal anecdote, but an assertion that proof exists and is available.
So where is it?
And I'll flag if you violate HN guidelines, which you have.
From the HN guidelines: [0]
> Please don't post insinuations about astroturfing, shilling, bots, brigading, foreign agents and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email [email protected] and we'll look at the data.
[0]: https://news.ycombinator.com/newsguidelines.html
I have to say, the fact that no one has done this makes me doubt it's real.
As hated as Facebook is, there's tons of motivation for people to catch them out with undeniable proof, and yet no one has done it.
FB having access to microphone makes sense for plenty of other completely innocent reasons (for example, if you can record a video from inside the app).
If this was actually true, I can’t help but feel that someone would have proven it technically by now instead of relying on these types of self experiment and anecdotes, especially given how commonly this is touted.
No.
Just tested this out, zero indication that the mic is hot on a recent iPhone with up-to-date software when recording a voice memo.
Edit: There ya go... downvotes for saying the iPhone has no indicator when you record audio.
https://litter.catbox.moe/cpvpr8.jpg
1) Does your iPhone still record audio when the screen is off?
2) Can you see the audio indicator when the screen is off?
3) If a background app starts then stops recording audio while the screen is off, would you have an indicator that it recorded audio?
Yes. iOS displays an indicator if an app has recently used the mic.
> Note: Whenever an app uses the camera (including when the camera and microphone are used together), a green indicator appears. An orange indicator appears at the top of the screen whenever an app uses the microphone without the camera. Also, a message appears at the top of Control Center to inform you when an app has recently used either.
Source: https://support.apple.com/en-nz/guide/iphone/iph168c4bbd5/io...
Also, I feel like the goal posts are moving quite fast in one direction.
The phone was sitting between two people having a conversation, one of them "swiped it open" meaning it was off to begin with, then was immediately displayed an ad for that conversation, and upon hearing this the tech-savvy person in the house understood what happened, confirmed it with the mic access to facebook in the settings, and then disabled the behavior.
Considering the original claim was "zero indication that the mic is hot" and now it's "zero indication that the mic is hot if the screen is off", I'd say that the goal post has moved considerably.
But if you want to know if Facebook is listening to you through the iPhone microphone, you should probably look at the screen for the indicator. iOS apps can't start recording on their own in the background, there's no API for that. If they are listening to you, they'd have to start the audio session in the foreground, which would allow you to see the indicator.
https://developer.apple.com/forums/thread/65604
https://stackoverflow.com/questions/70562929/how-to-start-au...
(Unless you believe that Facebook is using some kind of a private system API for this and is passing through the App Store checks)
I wrote the original "Wife swipes open the phone" comment, so that's the context you seem to be missing. Sure you can see a little dot on your phone when YOU run some experiment today and look for it, but was that indicator available in the exact situation where the targeted ad was displayed? No.
Also, this incident happened in the past and we know there have been dramatic API changes on both Apple and Facebook products. The limits of the API today don't reflect the capabilities that were available to developers in the past. I doubt Facebook is hacking the App Store process to use hidden APIs. It was probably just available in the past and my wife granted the facebook app complete access to the mic, so they took what they wanted.
I'd make sure to disable that permission today too, just in case.
One last thing is I just opened my iPhone again and hit record. I honestly didn't see the tiny orange pixel at the top of my phone until you pointed it out. I was basically looking for the green video indicator light to show. So I'm technically wrong about NO indication, you're welcome.
They explicitly said they were using an iPhone?
Please, stop with the sarcasm.
Okey, let’s say they manage to record us without a huge impact on our battery life. Now, how do you send these recordings or even the extracted keywords from a popular app, a client installed on devices controlled by the users and susceptible to reverse engineering and network traffic analysis without anyone noticing it?
It’s just too much risk and they don’t even need it, see my relevant reply here: https://news.ycombinator.com/item?id=32950204#32953216.
I swear that comment is sarcasm free.
They have far better information that shows I’m not interested in alcohol or extreme sports. Audio in the background is so low-signal that it isn’t worth showing ads based on it.
Even just transcribing speech something accurately is not something that was possible until the last couple of years. Yet this conspiracy theory has been around for a decade or more.
Yeah, it's probably not a coincidence that your wife is talking about X and is recognised by Facebook to be in a group of people that are interested in X.
You need to know if you got 3 topics of ads every day and 1/3 of them are related to that secret topic, OR if you get 300 topics every day and 1/300 are related to that secret topic. If it’s the former, it’s suspicious, if it’s the latter, it’s way less suspicious.
I’ve gotten Instagram ads for ketamine and I absolutely am not discussing or searching for it. I probably wouldn’t even notice a random topic if it’s not so absurd. I’m sure there’s tons of topics I don’t even realize I see.
Also, what we’re interested is if the text changed what was shown. If I saw ads for X last week but didn’t notice them, then spoke to a friend and noticed them and took a screenshot, it would appear to confirm the theory. Even though I was always seeing ads for X.
Ultimately, I don’t think people who are convinced of this theory will change their minds so it’s a moot point.
Is it so hard to believe that Meta is snooping on WhatsApp conversations? Meta, a company of unprecedented size that was built over monetizing your private data? A company who's been caught in plenty of scandals (like Cambridge Analytic) about this exact sort of thing (violating their users' privacy)?
Someone from this community, which generally means educated, tech-literate and sensitive to these topics shares a perfectly plausible observation, of something that has been experienced as well by plenty of other folks, me included; and then some people come and try to make up the most convoluted explanations (candy boxes from Kazakhstan just happened to be trending that specific day, nothing to see here, move along!) to this phenomena and try to shift the blame away from Meta. Why do you do this? Are you Meta employees? A PR agency they hired?
It's just baffling. Apparently some people DO want to be abused.
Plot twist: we all get ads about candy boxes from KZ now.
Despite the shady image they have, these companies go to great lengths to avoid doing shady things (because ultimately it’s bad for business). Not to mention the hundreds of tech employees that would have to be involved and keep quiet in this type of “conspiracy”. It’s incredibly unlikely, I truly believe that.
The tech companies you work for do often engage in illegal activities, and some of your collegues are complicit. I'm sure it is an uncomfortable thought for some of you, but this is all part of the public record.
This is certainly true in a lot of finance.
The sad reality is people are very predictable, even with basic data.
I can imagine the same thing done for text. The text might be encrypted, but interest keywords might be generated on-device and sent out-of-band.
I'm not claiming this is real, but I agree with GP.
For an example from non-FAANG companies, see illegal dumping of toxic waste by chemical companies, such as DuPont and PFOAs [1]. Despite knowing what they did was illegal, the math works out -- products with PFOAs were something like $1 billion in annual profit, and even when they got caught the fines and legals were a fraction of that, spread out over many years.
So I personally believe these companies 100% would do shady shit if it increases their profit margins. And why wouldn't they? There is no room for morals in capitalism, and the drawbacks are slim.
[1]https://www.nytimes.com/2016/01/10/magazine/the-lawyer-who-b...
As others above me have thoroughly explained, there are numerous ways Facebook could figure out what you’re reading about/listening to/viewing on the internet, which ultimately drives what you are chatting with your friends about. Reading your messages would actually be the most difficult and low fidelity way for them to try to mine this information. They can just see your entire browsing history and extract from there, since the majority of website have a tracking cookie that in some way phones home to Facebook.
If FB could do that, then FB would realize that these topics are not actually products they are interested in, so they wouldn't be showing ads.
Facebook show (many people) a lot of ads and they only need to get lucky a few times for you to think it’s uncanny. All the non-unique times an ad was not relevant will have blurred together and so you won’t easily remember that they were the vast majority of the ads you see. A little bit of feedback (eg if you dwell on the coincidental ad) may cause you to see more related ads.
I disagree with this to the extent that I would say the exact opposite is true.
Facebook (and others) have proven time and time again that they cannot correctly predict user behavior by locking out or banning users who actually did nothing wrong (because their algorithms predicted that the user was breaking terms of service or might be planning to). This happens over and over, even in cases not so complex as the "photos of my child to send to my doctor".
But on the flipside, Zuckerberg has been documented saying one thing to the public and exactly the opposite in private. Heck, Facebook has had memos and emails leaked where they talked about how they would say one thing in public (and to regulators) while doing the opposite secretly.
I believe that Facebook cheats and breaks agreements (and laws) in multiple directions all the time, often willfully. They've even been caught cheating their own ad customers by intentionally overstating the effectiveness and target accuracy of their ads.
I know that, unfortunately, this is what puts bread on your mouth.
But, really? Are you suggesting that Cambridge Analytica didn't happen? Did we all hallucinate that?
You guys jumped the shark already. These attempts at damage control are laughable.
CA happened but that has nothing to do with this. The policies that allowed CA to collect data were very public, Zuck enthusiastically talked about the open knowledge graph all the time prior to CA, much to the dismay of many investors. Facebook didn't lie in that case, they misjudged the potential to misuse open data access, and the potential for negative PR as a result.
By analogy, it's like you're the landlord of an apartment building and you don't lock the front door. You put up a huge sign saying "this door is unlocked, everyone is welcome". You sell ads for your building embracing the unlocked door policy. Then somebody walks in and photographs all the tenants through their windows. Suddenly people who didn't care about the unlocked policy are now very angry, and rightly so. But this is completely different from collecting data, lying about it, and operating a massive conspiracy to conceal the data use from literally tens of thousands of employees who would normally be able to see it.
We limit the information we share with Meta in important ways. For example, we will always protect your personal conversations with end-to-end encryption, so that neither WhatsApp nor Meta can see these private messages.
Facebook has a deep culture of pathological lying. They lied to the FTC [1]. They lied to WhatsApp and to the EU [2]. They created an Oversight Board and then lied to it [3].
Each of those lies are more substantial than lying in a privacy policy.
[1] https://www.ftc.gov/system/files/documents/cases/182_3109_fa...
[2] https://euobserver.com/digital/137953
[3] https://techcrunch.com/2021/09/21/the-oversight-board-wants-...
They can of course also have their app de-crypt and re-encrypt the messages to the key of a requesting third party like police or hired reviewers if certain keywords are used.
Authorities could also have Google or Apple ship a signed tampered Whatsapp binary to any user or group of users, like protestors, that uses a custom seeded random number generator so they can predict all encryption keys generated and no one else, including Meta, will know.
The variant of end to end encryption where third parties control the proprietary software on both ends, is called marketing.
As part of the Meta Companies, WhatsApp receives information from, and shares information (see here) with, the other Meta Companies. We may use the information we receive from them, and they may use the information we share with them, to help operate, provide, improve, understand, customize, support, and market our Services and their offerings, including the Meta Company Products. This includes:
- improving their services and your experiences using them, such as making suggestions for you (for example, of friends or group connections, or of interesting content), personalizing features and content, helping you complete purchases and transactions, and showing relevant offers and ads across the Meta Company Products
====
Popular theory is they can't see or store your messages, but can analyze them on the client and profile you (e.g. interested in brazil nuts)
I can perfectly mean just the audio exchanges when both parties talk.
Also: E2E does not imply necessarily that they do not know the key.
Where's the evidence? I don't know what ethos "Hacker News" is supposed to capture, but surely it's not superstition?
for a lot of people, no
> Meta, a company of unprecedented size that was built over monetizing your private data?
one of many companies, however "meta" does have the advantage that you can opt out of them, mostly.
> A company who's been caught in plenty of scandals (like Cambridge Analytic) about this exact sort of thing (violating their users' privacy)?
CA is interesting as it started out as an academic study, which was consented fully. CA then went on to scrape people's public profiles, which often included likes, friends, etc. This combined with other opensource information allowed them to claim to have good profiles of lots of people, the PR was strong. Should FB have had such an open graph? probably not. Should they have taken the rap for everything evil on the internet since 2016? no. There are other actors who are much more predatory who we should really be questioning.
> Are you Meta employees?
I think you place far to much faith into a company that is clearly floundering. Its not like it has a master plan to invade your entire life. Its reached it's peak and has not managed to find a new product, and is slowly fading.
However, as we all think we are engineers, we should really design a test! but first we need to be mindful of how people are tracked:
1) phone ID. If you are on android, your phone is riddled with markers. Apple, supposedly they are hidden, but I don't believe that they don't leak
2) account, and account is your UUID that tracks what you like.
3) your IP. if you have IPv6, perhaps you are quite easy to track. even on V4 your home IP changes irregularly and can be combined with any of the above to work out that you are the same household.
4) your browser fingerprint. (be that cookies, or some other method)
5) your social graph
method:
1) buy two new phones.
2) do not register them with wifi
3) create all new accounts for tiktock, gmail, instgram etc.
4) never log into anything you've created previously, or the fresh accounts on old devices.
5) message each other about something. However you need to source your ideas from something offline, like a book from a thrift store or the like. maybe an old magazine. open a page, pick the first thing your finger lands on. this will eliminate the "I heard about x" or "i'm in the mood for y"
report back.
Wait... If WhatsApp is really E2EE encrypted, why would any of the other steps be necessary? Dude and his wife can simply pick at page at random from a magazine in a store, never search anything online about it, start talking about it using WhatsApp as if it was something of great interest to them. If they start getting related ads, obviously something shady is going on. There's no need for new phones / new GMail accounts / etc.
because you need to eliminate the chance of profiling by any other means.
Using the same phone as before means that the pre-existing profiles exist, which means that the relationship is already inferred. Because its trivially easy to track people, you need to eliminate all other variables.
I don't know what is happening in this specific case. Perhaps the ads came from some other similar search queries. Perhaps they came from the keyboard intercepting what was typed. Or perhaps something else that I can't think of. But I'm nearly certain it did not come from meta intercepting the contents of your messages.
It's hard to convince people at this point because many have lost trust in Meta as a company, and I understand that. But I still find it stunning that so many people are making so many false claims without any actual knowledge to back it up.
I didn't have in mind the scenario of a keyboard logging user inputs besides the normal functionality of WhatsApp. I find this theory to be very plausible. Not at all happy with Meta's privacy policy, but I agree that it is worth considering other threats.
From using a VPN that logs all incoming and outgoing traffic (NetGuard) on an Android One device, I've noticied that the default Google keyboard gets in touch way too many times with some distant servers. Whereas, an open source keyboard from F-Droid, FlorisBoard, does no snooping and gets updated solely through the app store.
Another consideration, there are companies that track and sell geolocation data. It's "anonymized" but so precise you know the street address a user resides at. It is not a stretch to consider "anonymized" retargeting from keyboard inputs.
I was dismissive of it in the past, as comments voted higher here are. However I've seen enough weird ads show up within minutes of making jokes about obscure topics that I suspect there is something going on.
The piece that might be missing here is third parties collecting signals, "anonymizing" them, and then ads get re-displayed through Facebook, Google, etc. It may not be the major ad platforms doing it directly. In theory this should be harder now with the iOS tracking restrictions.
For the skeptical, consider Avast's Jumpshot. Here millions of users thought they were protecting themselves when their raw browsing stream was being sold live to third parties. I They aren't the only company that has done that. https://www.theverge.com/2020/1/30/21115326/avast-jumpshot-s...
Proprietary encryption means users cannot verify or control the keys or the code that generates or uses the keys. The app can exfiltrate the keys or do any keyword processing on behalf of Meta as well which can include well intentioned features like forwarding paintext messages containing certain dangerous-seeming words to authorities or theoretically trusted third party review teams. Naturally they could also return -metrics- about frequency of word use back to Meta for ad targeting as well.
I too have been a champion of encryption and privacy at past companies only to have all my work undone and watch all the data become plaintext and abused for marketing by a new acquirer.
The only way end to end encryption solutions can avoid these types of abuses is when the client software is open source and can be audited, reproducibly built, and signed by any interested volunteers from the public for accountability.
Short of that it is really not that much different than TLS with promises Meta will not peek, at least not directly, today.
The RNG example is a way to create keys that make it trivial for "C in the middle" with the RNG details to extract the contents. They are still valid, just not useful as keys.
The Juniper attack and Dual EC exploit is a good real world example of compromising an RNG for passive decryption, although Dual EC was designed to be like that.
https://www.cs.utexas.edu/~hovav/dist/juniper.pdf
Given evidence at hand, it is hard to view Meta as anything but a bad actor.
Isn't this kind of splitting hairs? Does it matter if text information came from a "side channel"?
It seems like the promise Facebook makes is that 'your communication using whats app is secure,' that's certainly my interpretation of what "end to end encrypted" means. It is a promise of security. That means text is sacred and even text sent to giphy should be privileged from the ad machine.
The question being asked here is not "is it end to end encrypted?" It's "are my communications secure?" End to end encryption is just one element of that security.
Imagine a hapless military professional/politician downloading one.
The problem is one of alignment. Facebook wants to monetize whatsapp and wants the whatsapp data. That's why there was a mass exodus to signal in the first place. Facebook was weakening the protections of the app.
Due to the alignment problem Facebook can't advertise whatsapp as the secure and private choice because they are actively working to make it less secure and private. That's why Brian Acton quit (leaving $$$$$$$ behind) in the first place.
In a perfect world sure Facebook has the power and money to do a lot of things. So do the other megacorps. They don’t do them, and you’re correct it is the misalignment of incentives to due so.
But Facebook doesn’t control what keyboard you use on your phone and if the keyboard is sending every message you type somewhere, they can’t do anything about that and they aren’t lying that they can’t read your messages.
Whether or not you believe that they do in fact harvest the message data is up to you. But certainly people using keyboards that harvest data is very plausible to me as a vector for this stuff.
If I were to summarize my entire thoughts on WhatsApp, it's that it advertises security (e2e), while they only make money from violations of the security. The behavior OP expects is exactly the behavior a person would expect from this set of alignments.
If a leak is able to be monetized (even if it is google harvesting keyboard data and selling it back to FB) do you think that would be punished or rewarded?
If this very same post were for signal, I think the response we might expect is concern and investigation, not a response of defense and deflection.
There was an article several weeks ago about how a "special master" tasked with understanding what data Facebook collects on you was stonewalled because "even Facebook don't know what data Facebook collects."
https://news.ycombinator.com/item?id=32750059
"we don't want to be accountable for any data except the data that's part of the download your data":
> Facebook contended that any data not included in this set was outside the scope of the lawsuit, ignoring the vast quantities of information the company generates through inferences, outside partnerships, and other nonpublic analysis of our habits — parts of the social media site’s inner workings that are obscure to consumers. Briefly, what we think of as “Facebook” is in fact a composite of specialized programs that work together when we upload videos, share photos, or get targeted with advertising. The social network wanted to keep data storage in those nonconsumer parts of Facebook out of court.
> Facebook’s stonewalling has been revealing on its own, providing variations on the same theme: It has amassed so much data on so many billions of people and organized it so confusingly that full transparency is impossible on a technical level.
> The remarks in the hearing echo those found in an internal document leaked to Motherboard earlier this year detailing how the internal engineering dysfunction at Meta, which owns Facebook and Instagram, makes compliance with data privacy laws an impossibility.
Facebook doesn't even want to know if the WhatsApp is leaking data.
Excuses like "the user did something bad" aren't productive.
A warning that the users expectations (secure communications) do not match reality (3rd party keylogged communications) seems like the minimum level of responsibility:
https://maheshikapiumi.medium.com/allowance-of-third-party-k...
If WhatsApp derived information is being seen in advertisements, it is Facebook's responsibility. It is in Facebooks best (next quarters profits based) interests to not be responsible.
WhatsApp architecture is designed with the assumption that the server could be compromised and yet such an event should not result in any message contents being revealed. Furthermore, the encryption function is designed to ratchet and rotate keys so that a leak of a key at a given point in time would not compromise past and future messages.
So yes, I have a strong sense of confidence that message contents are not exposed to Meta and, given the bar set by privacy reviews, I don't think Meta would do some backdoor workaround like scraping the contents off the device and sending an unencrypted copy. To be clear, my claims are specifically around message contents and when it comes to certain metadata (ex. the sender/receiver, the names of groups, etc) I don't recall the exact details of how they are treated.
Now, despite the fact that I've said all this and that my knowledge on the matter is fairly recent, I'm not sure I could ever say anything with absolute confidence. The code base is huge and not open source. I obviously have not seen every line of code and as you pointed out, there's always a chance some company policy changes happened without my awareness. So I would say "highly" confident but not "absolutely" confident.
Meta has control over the app Sue uses. So they could send them to Meta unencrypted in addition to sending them to Joe in an encrypted fashion.
Or they just extract the relevant terms:
Sue->Joe: "Hello Joe, I'm so excited! We are going to have a baby! Let's call it Dingbert. You're not the father! Jim is. I hope you don't mind too much!".
Sue->Meta: "Sue will have a baby"
Insta->Sue: "Check out these cute baby clothes!"
Under Privacy > Photos, you can set “Selected Photos” instead of “All Photos” on a per-app basis.
Then when you go to add a photo to the app, you first go through an iOS prompt to select the photos the app will have access to. Only then do you go through the app’s photo selection dialogue.
I have all my apps set this way (or “None”).
Not saying it doesn’t work like you say, just saying it doesn’t look like it does.
I assume Instagram and friends would do the same.
I often just take the photo via Telegram instead, which automatically adds it to your photo roll and gives Telegram access to it. It works relatively well.
Then on Instagram (for example) when you go to post, you’ll get a message like “you’ve only let Instagram have partial access to your photos - Manage”. Tapping Manage will let you select photos that Instagram can access.
The other day I noticed the yahoo mail app on iOS was reading my clipboard for no reason. I’m going to start blocking photos on most of my apps.
You can reverse engineer those things and analyze your network traffic. You can’t have a client in a device controlled by the user, in this case an app, send anything to a server without anyone noticing it.
And frankly, they don’t even need it. Just with your contacts they can link you to your friends and common interests without even you having a facebook account, all you need is friends with a fb/ig account who have linked their accounts to their phones and use whatsapp.
The contacts are known to be sent to the server, they are known to be linked to facebook except in the european union where there is a different app from WhatsApp Ireland and a different privacy policy that specifically states (in the version outside of EU) that it shares your contacts with facebook and they are much more valuable and much less risky than reading your messages.
I frankly don't think people realize how much obfuscation of both app code and network traffic goes on under the hood. "analyzing network traffic" isn't a sustainable option when things are encrypted and behind dozens of layers of protobuf, websockets and other fancy protocols, and get updated and change around all the time. Far from everything is introspectable http, javascript and json these days, and that applies espeically to big apps like these. It's not hard to send privacy-sensitive data along with "legitimate" data like analytics at unexpected times and evade scrutiny.
Yes there's people that dedicate themselves to reverse engineering apps like this, but they're few and far between, and most of them focus on either the easy fish, or security vulns. Considering nobody's building public documentation on the protocols of these apps I'll have to assume it's hard enough and changes often enough to be worth the time of people without special monetary interests.
I agree with the rest of your assessment, there's way less "obviously malicious" ways to exfiltrate data about users than literally uploading users' pictures, since for example whatsapp stored unencrypted backups on google drive until very recently, among other things. I'm just trying to shed a light on the fact that apps like this have a lot of ways to accomplish this without raising too many eyebrows.
I see you’ve never heard of Jane Manchun Wong...
Turns out they were also building a database of everyone's face so they could build shadow profiles...
> her Instagram was showing ads for a store that was selling the same type of puzzle
How did she take the pic ?
Or they can say, technically it wasn't a message before it was sent. The dictionary definition[1] even mentions "send".
[1] https://www.oxfordlearnersdictionaries.com/definition/englis...
No, that's precisely what End-to-End encryption means.
Facebook loves to use newspeak, wouldn't surprise me if they applied newspeak to what "end-to-end encryption" means.
Signal might be the only app unable to read, but even that, I would not trust.
The quandary of what one allows to run on those implants sounds like a chilling sci-fi novel (chilling not because "but FAANG could read your thoughts!" but because people would absolutely still get them installed).
https://en.wikipedia.org/wiki/End-to-end_encryption#Endpoint...
It's illustrated in their example below that they if you say you're having a baby, meta can send some type of distilled ad-keywords to its servers (eg `[mother, baby]` if it knows the user is a woman based on their name/profile, but probably more sophisticated than that). The message you sent is still technically end-to-end encrypted, though,
https://news.ycombinator.com/item?id=32951417
I think it does actually no one except them can read them. If someone else can, then by definition it's not end-to-end encryption.
From https://www.definitions.net/definition/End-To-End%20Encrypti...
> End-to-end encryption (E2EE) is a system of communication where only the communicating users can read the messages.
E2ee means only the messages themselves can't be intercepted and read. But if anyone can actually prove fb acting on message contents, I suspect the EU banhammer would be interested.
But if the message is copied, read, analyzed and sent further on behalf of a third party before encryption, then that puts that third party in the middle between the sender and the recipient. A man in the middle directly undermines e2ee: "no one else reads your message".
It doesn't matter if the third party made the messaging app or not. What matters is whether information in your messages is accessible to anyone besides you and the recipient.
This said, analyzing messages for the purpose of ad display is creepy, whatever the way it is done.
Notice that "ends" in "end-to-end" are users, not applications. When an application forwards things to an entity, then that entity becomes an "end" of the conversation. When it displays a message to the user, the way the user wants, then the user is the end. When it processes the message and delivers results to Facebook, the way Facebook wants it, then the application makes Facebook the "third end".
In such scenario, Facebook had intercepted the message, just chose to forward only some extracted information (which may or may not be enough to reconstruct the original). This does not match the definition of "end-to-end encryption".
That's not right. First, it's technically an impossible, since users can't do encryption themselves - it's the application that does it. That's where the e2ee boundary is.
Second, we've got e2ee communication between non-user entities as well. There's are servers using for example zerotier which communicate e2ee through other nodes. Third, applications can definitely send the data to other parties automatically. WhatsApp executing backups as configured does not make it not e2ee.
We know the app decrypts it to display it. But if the app decrypts it to send it to the parent company, then it is by definition not end to end encrypted anymore.
If the app decrypts it, analyzes it and sends information about the message to the parent company, then the same thing is happening. The parent company is reading the message, INSTEAD of E2E encrypting it. It doesn't matter whether that reading happens on device or on the company's servers. E2E means the company is not reading it.
It’s possible that this data harvesting ad company has redefined what E2E means (to them) to advance their business interests.
HTTPS is E2E between the client and the server.
so what's the point? just inconvenience. better to use telegram at this point.
Not that I trust WhatsApp (I use Signal) but that's an odd comparison.
First of all, transport security (server-client encryption as you called it) like TLS is irrelevant for this discussion. All major platforms on the internet employ transport security these days, so this is a given.
The point I'm trying to make is that Telegram does not offer E2E encryption by default: (Non-"secret") Messages on Telegram pass through Telegram's servers unencrypted and are also stored there unencrypted, meaning that Telegram has access to all your messages. This is not speculation – Telegram openly admits to this in their FAQ:
https://telegram.org/faq?setln=ru#q-do-i-need-to-trust-teleg...
(See also the link contained therein.)
Meanwhile, the speculations in the present HN discussions aside, WhatsApp does provide E2E encryption, so – from this POV – is orders of magnitude more secure.
> We limit the information we share with Meta in important ways. For example, we will always protect your personal conversations with end-to-end encryption, so that neither WhatsApp nor Meta can see these private messages.
They are saying they dont store or forward your message text, not that your phone doesnt send them topics of interest
Assuming they're not blatantly violating the policy (which I think they've done before), it's pretty easy to weasel out of that statement by only sharing keywords from the conversation, or only sharing the info with advertisers (but not WhatsApp and Meta), or redefining what a "personal conversation" is, or carefully redefining what "end-to-end encryption" means, or ...
There's no transparency, a huge power imbalance, and terrific pressure on WhatsApp/Meta to monetize as much as possible.
I've always suspected them of recording conversations, also why I think Android has gradually tightened permissions and visibilty around speech to text/microphone/camera use.
Looking at this from a reality perspective is not very helpful.
* Have pairs of mobile devices set up from factory configuration with WhatsApp and Instagram installed.
* Simulate conversations between each pair from select topics.
* Collect all ads from Instagram after the WhatsApp conversations from each device.
* Categorize ads to broad topics.
* Search for significant bias.
There are probably a lot of factors I'm missing here, and it's probably easy to introduce bias when there is none there. For example it's probably a good idea that a different person categorizes the ads into topics than the person handling the specific phone, otherwise the person might bias the categorization of the ads based on the conversation they had on WhatsApp beforehand. The person categorizing the ads should have no knowledge of the WhatsApp conversation that happened on the phone. The devices should probably be on different networks. There is probably a lot that I am missing here.
Once or twice may be a coincidence. Maybe. But this happens regularly and with startling specificity.
What could be listening? I'm a technologist like the rest of you. I know apps need permissions to the mic, I know it's not easy for an app to stay in the foreground. Is it my Roku? My smart TV?
Makes one want to go full Richard Stallman.
p.s. my wife just said it would be really funny if Google News showed an article now on people worrying about their tech listening to their conversations. I'll post an update if that happens...
e.g. You and a bunch of friends go to dinner and have a conversation about <topic x>, at least one of those friends googles something about that topic. You later see an ad related to <topic x> because you were targeted based on the search your friend did while they were near you.
If your wife potentially did anything digitally, related to the diabetes topic, its likely that you were targeted based on that.
Again, no idea what happened resultant to the story you shared, but whenever this sort of thing happens to me I try to appeal to Occam's Razor based on how much I know about how this tech works under the hood.
It was mentioned elsewhere in this thread, but this is where Baader Meinhof may apply. I don't how many times I've seen a targeted ad that was a "miss" in terms of recency bias. But I absolutely remember every time there was a "hit".
Both situations were targeted based on my digital behavior, but they're playing the volume shooter game. Taking as many shots as possible hoping eventually they score. This could be true in your case. The fact that you and your wife were having a recurring conversation about stem cell research, diabetes, etc. suggests that its likely that this is in your digital fingerprint at least once in the past (recent or otherwise).
Something I try to do now, when I'm being mindful about it, is note how often I see ads that are definitely in my interest bucket but are completely uncorrelated with any recent conversations I've had. That helps at least establish an anecdotal ratio of hits-to-misses that makes the Orwellian/dystopian much less reasonable on balance.
I was making the point that I have to assume much less to make the statement that companies are not passively recording, tokenizing, and analyzing every conversation users have without their consent.
Quoth Richard Feynman:
> “You know, the most amazing thing happened to me tonight... I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!”
You were gonna see ads, or news stories, or whatever. You happened to see this one. And it happened to connect to a conversation you were having. Amazing! What magic!?
Well how many other ads or other times didnt it connect to a conversation and you just don’t remember because it didn’t feel special? Probably many more times than it did work.
Lately I’ve been trying to find alternate explanations. Hm we were talking about going to a restaurant tomorrow. Why is my girlfriend getting all these restaurant ads on instagram all of a sudden? Oh right, because it’s Thursday, we live in a city, I searched Google for “Good Friday restaurants for a date” during our conversation, and we live on the same IP.
However, I also am reminded of Nassim Taleb's Fat Tony, who when asked 'what are the odds of flipping a coin heads 10 times in a row?' responds with 'fuggehdabowdit. it's a hustle.' There's a scientific response, which can be very naive in some ways, and there's the Fat Tony street analysis. As I've gotten older, I tend to value the latter.
Definitely a good perspective to keep in mind. But at Facebook scale, if a billion people flip an actually fair coin 10 times, around a million of them will get 10 heads. It would be understandable for those people to conclude that the coin is biased, but they'd be wrong.
Humans are very good at finding patterns in happy little accidents. Much better than the ad networks are in making those accidents happen. If the tech was as good as they say, we’d never ever see an ad that wasn’t a hyper relevant instant click.
And it seems unlikely that Apple is selling my iMessage conversations to Facebook. The “we searched for it, created an explicit signal, and forgot because it’s such a habit” explanation seems more likely.
Plus if a simpler method like “hey your location is at a restaurant on most Fridays, why don’t we start pushing restaurant ads on Wednesday” works well enough, why wouldn’t they use it?
So sweet boxes from Kazakhstan suddenly appearing in your ads is due to observation bias...
My mother recently remarked in a unique conversation that we have had a box of Golden Grahams cereal for a year and should find a recipe to use it up. She opened her phone to search, and lo and behold, the top recommendation after only two letters, R and e, was "Golden Grahams recipes". Not only had that never been a topic of conversation or search beforehand, nor did she have Google open on her phone, but you may have noticed that "Golden Grahams recipes" doesn't even start with R or e. This sparked a long conversation about how privacy really is something worth fighting for.
My only guess is that Google has the ability to listen in because we use Android phones.
It should be possible to run scientifically sound experiments to show if it's happening.
1. Nobody is reading your WA messages, the same topics can be learned from your browsing activity or other msgs, eg. by reading your sms texts.
2. Meta is reading your messages directly in-transit, server-side.
3. Meta is not reading your messages server-side, but the Meta apps extract keywords from your conversations and request relevant ads from the ad servers.
4. Another non-Meta app is doing the above.
5...
I think that 1 is the most plausible, however the original post is about "topics they never talk about", so assuming that WhatsApp is the only channel and they don't leak data in other ways (and there are many other ways to leak data), then 1 becomes unlikely.
3 is the most compatible. All the targeting can be done locally, so no end-to-end unencrypted message leaves the app. The app then sends your topics of interests to Meta.
4 again assuming WhatsApp is the only channel, then there is probably some malware somewhere, and it is unlikely that Meta accepts illegally collected data (they can do it legally, better, and with less potential trouble). There are however a few legitimate apps that can do the above. I am thinking about things like predictive keyboards, accessibility apps (screen readers, ...), backup apps (end-to-end encryption is about transmission, not storage), and the OS itself. I don't think Meta controls any of these, and I don't think they would buy data from them (Google and Apple are competitors after all).
So I would go for an accidental leak (case 1). For example, for the experiment to be meaningful, you shouldn't tell anyone about the test topic before you receive the ads. Or with the WhatsApp app hinting Meta about your topics of interest.
I agree, I'm pretty sure 2. is not the case; I just listed it as a theoretical possibility. Despite all the bad press and problems, FB has very (very) high integrity and standards, at least the parts I saw.
Tell that to Amazon who never fail to recommend me things I just bought.
The thesis is that recent vacuum cleaner purchasers are many times more likely (than the average person) to be looking to buy a vacuum cleaner.
Apparently about 20% of Amazon purchases are returned. And most returners are looking for a replacement. Some of the replacement product research is done before the return decision is made, so you get ads even if you have not initiated a return.
As much as Amazon doesn't want you to return your purchase, they really don't want you to buy the replacement somewhere else.
It would be interesting to measure how the ad ratio changes over time. Particularly when you exit the return window, but of course Amazon will know the return-likelihood curve with much greater precision.
WhatsApp's end-to-end encryption is used when you chat with another person using WhatsApp Messenger. End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp. This is because with end-to-end encryption, your messages are secured with a lock, and only the recipient and you have the special key needed to unlock and read them. All of this happens automatically: no need to turn on any special settings to secure your messages.
https://faq.whatsapp.com/791574747982248
This does not exclude an algorithm running running on the sender/recipient's App from scanning the content and sending suggestions to AD servers.
I guess the key lies in "what is sent" in the above statement. The casual reader might reasonably interpret as "no-one except the intended recipient can see _what I type_". But it doesn't say that. It only covers what gets _sent_. It doesn't say anything about what happens to the content outside specifically _sending_ it to the other party(ies).
The more obvious explanation is that before or after the e2ee (i.e within the app itself), an algorithm scans the content, categorizes it and sends this to Meta/Facebook.
In this scenario, *Nobody* has read the content other than the person you're communicating with.
Basically, let the AI figure out what ads get clicked the most for a given string of encrypted 24h window of chat history. Eventually, the AI is going to hit on its “Rosetta Stone”, even without ever formally decrypting the text, much less any human reading it.
With millions of conversations happening on WhatsApp, why shouldn’t that be possible?
And it’s not even a breach, technically, because nothing ever got decrypted, and the similarity vector generated by the AI have, per se, nothing to do with the content of the conversation or the individual that sent them. Run the same training algorithm again and they’d look completely different! Hence they can’t possibly be “personal data” in the sense of the law.
If you can discern meaning from noise, then your theory would work. But discerning meaning from random noise is obviously impossible (i.e. what if there is no meaning?).
If you leak information than you say, then the encryption is worthless. Harmful, even, because you think you have protection when you do not.
e2e encryption doesn't forbid to read the messages as you type or read them or read a screenshot of the screen or whatever they can do inside an app :P
They were caught activating your camera by "error" a while ago https://www.macrumors.com/2019/11/12/facebook-bug-camera-bac...
As per the experiment you did...
We did the same experiment with a female friend a while ago. We started talking about her pregnancy (a topic we never touched, as she was single and of course not pregnant) in a group chat, specifically targeting her. Sure enough, after a couple of days her fb and instagram were full of strolley ads (but not ours) :)
There's no reason you'd have noticed an ad about the puzzle in the wash of content and other types of ads.
yup, do you think all these img recognition stuff is made for fun? companies wants to use them to read our pics on your phones and profile us
Is it possible something like that happened?
In general, while anything is possible, my own occam's razor calculation is that if someone does have a way to get through ostensibly end-to-end encrypted messages, it's going to be government actors saving it for law enforcement/national security purposes. They wouldn't "waste" it on ad targetting. And if it's being secretly used for ad targeting so many people would know about it, people who aren't disciplined military bound by law to secrecy, that it would be quite likely to get out and be revealed and no longer secret.
How We Work With Other Meta Companies
As part of the Meta Companies, WhatsApp receives information from, and shares information (see here) with, the other Meta Companies. We may use the information we receive from them, and they may use the information we share with them, to help operate, provide, improve, understand, customize, support, and market our Services and their offerings, including the Meta Company Products. This includes:
...
- improving their services and your experiences using them, such as making suggestions for you (for example, of friends or group connections, or of interesting content), personalizing features and content, helping you complete purchases and transactions, and showing relevant offers and ads across the Meta Company Products; and
(All this is just a guess based on OP's report and the above quote.)
Or, a similar idea is that ad companies don't really need to know anything about you so long as all your friends are "unprotected".
For example, you may pick "lawn furniture" as your "totally random" item to test WhatsApp. What you don't remember is that a good friend mentioned lawn furniture to you 3 days ago and just did 14 web searches on Google and FB marketplace to find some. They have strong metadata ties to you, so you get served ads on that topic too.
For years and years and years, there have been people claiming their voice assistant (for example) is listening in on their conversation to show ads, and so forth. And it's always anecdote, never any hard data.
And the thing is, if this were the case, it would be relatively easy to prove with a controlled experiment that other people can replicate. And yet, somehow, magically that never happens.
Sure, Google used to algorithmically read your Gmail to show you relevant ads, but they were totally open about that, and then they stopped because it weirded people out anyways.
If Facebook were mining Whatsapp messages for ad topics, they'd probably be as open about it as Google was, out of pure self-interest. Because right now so much of their advertising is about how Whatsapp is trustworthy because it's E2EE etc. So if they were secretly analyzing messages, it would blow up the reputation of their main marketing message. There's a good chance it would be business suicide for Whatsapp. A profit-driven company probably isn't going to take that risk.
To be honest, this post feels social-engineered by a messaging competitor or something. I'm not saying it is, but the personal touch ("silly little game with my wife"), the innocent questioning ("Is... or am I missing something silly?"), and the total lack of any objective evidence (e.g. screenshots of messages and ads) are all HUGE red flags.
If Meta really is doing this, it's pretty easy to prove with hard data, and that's going to become a front-page news story on the New York Times. The fact that that hasn't happened leads me to think it's much more likely there's nothing here.
To the contrary, WhatsApp advertising E2EE so publicly can be seen as actively trying to get away from past scandals and building trust.
So the idea that they would be undermining themselves at the same time doesn't make a whole lot of business sense.
Even Swiftkey keyboard (bought by microsoft) sends back telemetry to MSFT. Try a keyboard from Fdroid, but it may not be as feature rich.
I’ve never been able to reproduce these experiments. But keep in mind that I’m european and my WhatsApp app is slightly different - it is a version from WhatsApp Ireland (instead of WhatsApp Inc) which shares less data with third parties, the privacy policy is also slightly different for the european union.
Edit. Another idea: try to reproduce while disabling predictive text on your keyboard.
Google are definitely collecting data from gboard.
That may not be directly shared with meta but is likely to get indirectly shared through overlapping advertising identifiers. They won't be openly sharing your text, but they will be scanning to and flagging you as having interests in something in your text or something related to what you said, then sharing that with advertisers.
I have security code change notifications enabled, and around November 4, 2021 a large number of my unrelated contacts suddenly had security code changes. There wasn’t any media reporting at the time, but I remember some others mentioning it on Reddit[0] (would love if anyone here can scroll back in their message history and look for security code changes around the same time - maybe we can finally shine some light on this).
Since then I have assumed they are flat out lying about the fact that “not even WhatsApp can read your messages” (direct quote from the iOS app).
Also note that both iMessage and WhatsApp strongly encourage you to enable iCloud backups, which are not e2e encrypted and readable by Apple (Apple only claim backups are “encrypted” and that messages are “e2e” encrypted):
https://www.rollingstone.com/politics/politics-features/what...
At least Apple are not flat out lying like Meta, but they are still being incredibly deceptive with their marketing.
Use Signal if you care about e2e encryption. Everything else is a marketing slight of hand.
[0] https://www.reddit.com/r/whatsapp/comments/qm2ufw/security_c...
-> That's not completely true (at least for WhatsApp): It is possible to enable a e2e encrypted backup right in the chat-backup menu.
However, I still believe Facebook holds a second decryption key for all messages, which they rolled out along with their web access product as described above. So they are not e2e encrypted by any reasonable interpretation of the phrase.
I am not aware of any way to e2e encrypt iCloud backups, so the vast majority of “e2e encrypted” iMessage messages are readable by Apple.
E2EE does not mean anything in a world where both ends are owned by the transport layer.
I'm not saying they're doing anything wrong, you could be mistaken and information can be exfiltrated some other way.
But: Either you trust the transport layer or you don't. Saying "E2EE means the transport doesn't have to be trusted" while running a neigh impossible to reverse engineer binary on both ends distrubuted by the network --- *is* trusting the network.
Everything above is supposition from something I vaguely remember but not 100% sure.
Of course that reads backwards just as well, no need to implement complicated "end to end encryption" if both endpoints are hopelessly powned.
Tangental aside; it still confounds me where the business opportunity of WhatsApp resides for Meta if they "can't" get access to the data.
This has always been disingenious.
WhatsApp control the client, the client displays the unencrypted message, ergo WhatsApp can read the message.
It provably does when it interprets links and does a web page preview card.
Also... that is highly likely leaking your advert profile as even if the preview didn't then any visit to the website is outside of WhatsApp and is now tied to your IP, browser cookies, etc.
All of the above can be true without end-to-end encryption being broken or otherwise defeated on the server side.
Then randomly select one of those three topics to discuss on WhatsApp.
Finally, keep an objective eye out for ads about all three topics. Ideally, log every single ad you see.
I was sleepy so when I woke up I tore apart the pillow after brushing my teeth but there was nothing there. They must have taken it out surreptitiously when I was in the bathroom.
I know it wasn't my wife because I put her phone in a Faraday cage to stop her from using the Internet when I'm home. Unless... now that I think about it. Unless she's secretly working for Facebook. She said a friend of her's could get her a Portal webcam. No one buys those things unless they're working for Facebook!
This was as part of a FB/GOOG deal where the storage for WhatsApp backups did not count for your Google drive quota.
Recently the backups did finally become encrypted as well. With a key known to the WhatsApp app. (On Android, stored in a file called "key" in the apps local storage)
However, when you restore the backup, where does the key come from? From the WhatsApp servers, obviously.
So still, FB and GOOG together still have full access to your daily backed up messages.
And the free storage deal is still there, of course.
Please do correct me if I'm wrong and you know better.
Let me tell you about another third party that is not mentioned here. The telecom operator. Once, in 2019, I was reinstalling WhatsApp on my smartphone. I checked the Internet traffic records (outbound) with curiosity. This led me to finding out that my phone was reaching out to *whatsapp.com (Meta's servers) and Belgacom (Belgium's telephony provider). So, my phone's data was being routed through the parent company's devices and some other third party services like a national telecom company. I don't if it's still the case nowadays, but since that day, I have more worries about their encryption and data relay practices.
What happened with Skype before was that Microsoft would ping any links from their servers, so it was really easy to prove it by generating a new web server, publishing it nowhere and then mentioning it in a chat. This caused some publicity and they stopped the practice. Skype didn't guarantee E2EE at that time though.
But perhaps you could do a similar 'clean room' excerise to prove it. I don't think they would break the E2E by the way but perhaps there is something calling home in the app itself.
If 200 people send preselected product communique over brand new devices and are 800% more likely to be targeted by ads for those or related products by the control group of the same size then we have enough for a legal case and discovery.
It really should be airtight if they're not going to weasel out of it.
I kinda doubt Meta is doing it but on the other hand they do seem to be in heavy weather right now.
Plenty of people have attempted to prove it, and gone radio silent after trying.
There are billions of users, and tens of millions of them are technical, and tens of thousands of them work for these companies. Yet there's 0 evidence of this ever actually happening. Does this not seem strange to you?
The reason it's not all over the media is precisely because it doesn't hold up.
Anybody who's dedicated to this can select truly obscure terms, fully document their private chats and full internet usage and ads shown, and show whether this effect is actually happening.
The reason we don't hear about this is because the snooping doesn't appear to be happening. So you just get a bunch of people sometimes claiming it "seems like" ads are coming from private chats, because coincidences do happen statistically so it will always happen to some degree to some people.
The reason it's not all over the media is simply because the phenomenon doesn't appear to exist, not because it's hard to prove.
type a message -> msg encrypted -> msg sent -> msg received -> msg decrypted -> msg viewed in app
Then consider the following:
type a message -> msg encrypted -> msg sent -> msg received -> msg decrypted -> app scans content and sends classification to ads server -> msg viewed in app
Both are end-to-end encrypted.
Another hypothesis is that you are taking other steps such as searching for that topic so that you can send something to your wife those extra steps might be enabling tracking.
Can easily be proven by sending someone a link that only you can access (hosted on the LAN which the recipient can’t reach) and seeing if the recipient still sees a preview.
Online ads ... if you've ever paid for one you know they are desperately in need of targeting. We consumers provide our info directly to folks who sell ads, under terms and conditions that we don't understand. Of course they're making use of this free resource. They'd have to be idiots not to.
So ... with respect, the wave of denial in the comments here ... 10 years ago, that would have seemed "naive but understandable." Today it's just weird. Almost like some kind of absurdist comedy. It's totally disconnected from the world we actually live in.
Irony notice. This message contains irony.
I didn't agree to the recent WhatApp nor Facebook's TOS so no longer have their product on my devices. I suggest you do the same, or just sit back and enjoy the specialised, relevant, targeted ads, but think twice before each send.
A button next to every ad that says "why me?" that details every byte of my data scraped to generate that specific ad. Was it a GPS location from an hour ago? Did you scan through my photos? Did you figure something out from my youtube watch history? TELL ME!
The problem is that the GDPR isn’t enforced enough even at a very basic level, let alone technicalities like this.
I don't think there is any fault with the e2e encryption. Humans are very bad at seeing causality when there is none, or accidentally leaking their thoughts into the search box.
There could also be leaks with the clipboard, photo gallery or keyboard - all things that freemium apps love to scan in the background. The way real-time-bidding ad markets work, anyone that the data leaks to can influence ad ranking - doesn't have to be FB/Meta.
If you did a true blind study, I think you'd find no link.
For example, start with a list of 1000 products images and accompanying text. Select 2 at random each day. Flip a coin to decide which to send (keep the other as control). Cover the screen so the user can't see what they've sent/received. Then, a few days later ask the user to select which product they think they sent.
I'd bet that even after months of doing this, there will be no finding of a leak.
But it is possible for the client itself to build a map of advertising id -> interests and send that over to meta separately. This would be similar to one of chrome's proposals.
If you try this several times, the messages are not working and the corresponding ads show up you can be sure that it is not because they are reading your messages. Which does not rule out the possibility that they might read them, but at least you can be 100% sure that your ads are not showing up because of your messages in this case.
If they do show up you could then try discarding other factors like client-side keyword analysis e.g.: talking about very generic things which are not useful to ad trackers like "how you going?", etc (ie. awkward elevator conversations), but it is harder to test a null hypothesis for client-side keyword analysis.
Of course, if it does not work, use a proxy.
My answers being yes - yes - no, the question of 'do they listen to target the ads they try to make me display' is pretty irrelevant to me. I can't trust them not to nor check reliably if they do.
If you try to address a different question such as 'do they really encrypt reliably to protect your conversations from being snooped on without their authorization', the threat analysis may differ. In that case they have incentives aligned with yours and are probably faithfully trying to effectively protect your/their data.
At the end, I'd estimate the probability of the scenario and how I value the consequent loss of privacy. Then accept/mitigate/refuse the risk accordingly.
Significant customer data exposed in attack on Australian telco - Subscribers have questions – like 'When were you going to tell us?'
Boeing to pay SEC $200m to settle charges it misled investors over 737 MAX safety - Ex-CEO also on the hook for $1m after skipping over known software issues
Privacy watchdog steps up fight against Europol's hoarding of personal data - If you could stop storing records on people unconnected to any crimes, that would be great
Meta accused of breaking the law by secretly tracking iPhone users - Ad goliath reckons complaint is meritless – but it would, wouldn't it?
Federal agencies buying Americans' internet data challenged by US senators - Maybe we don't want to go with the netflow, man
.. and I'm only halfway through!
Did you think the rule of law is there to protect you? Do you think corporations won't break the law to get access to information? Have you learnt nothing?!
Personally I use Telegram which works fine for me and I have taken a fair amount of flak for saying it is a better choice than WhatsApp.
I'll still try again: There is more to security than protocols and algorithms. If you value your privacy, don't use a free messenger from a company with a long record of sleazy behaviour.
It's very easy to blame an application, but the problem with the modern ecosystem is that it's all very interconnected. Signal makes a point of having a setting that sends a request to the keyboard to disable personalized learning, but even that is a request. There isn't a guarantee that it complies.
Companies that deal with data will not use a single source of information, but a huge variety of sources and your smartphone is like a huge vacuum that is pulling in everything it can gather from you through any means possible.
Lastly, it could also be observation bias as others have mentioned, but to truly be able to regain control, you would need to take a variety of steps to make this change.
i know that's wild, but also often true. humans are bad at randomness. there may be no direct leak at all of your test topics, they might just be guessable based on everything that is known about you, people like you and things you've been presented or looked at.
Maybe this could be counteracted by something like:
1. Generate multiple random topics and only send one across WhatsApp. Count "related" ads for each.
2. For every other random topic don't send ti across WhatsApp and see if you still find "related" ads.
assuming that's all true, good news is that they're not doing any totally illegal spying. bad news is, they don't have to.
> If you are seeing dozens of ads a day on Instagram and suddenly you have some "random" topic in your head you will mentally connect them.
ahh yes. incidentally that sort of behavior is often linked with the onset of poor mental health. there are some interesting questions around the limits of personalization and impacts on mental health. if the machine behaves like magic, specifically on a personal level, does that encourage magical thinking?
People have been claiming this for years, and yet we have never seen actual evidence. I completely understand being creeped out by the surveillance shops, and I've seen coincidences that weirded me out.
But if this is going on, then there is network traffic about it. And busting FB with real proof of audio surveillance would be a massive feather in some researcher's cap.
I don't buy it.
The case that was the final straw for me was when I was chatting with my partner and remembered a funny song from my childhood, so I opened YouTube on Safari and showed it to her. A couple of minutes later, she opens Instagram (on her own phone) and the first "follow suggestion" is the artist from the song. She had never heard of this song before, much less of the artist (which is not famous at all).
I would understand if everything happened on my phone/accounts, but the suggestion was on her phone and account. I don't think they're literally listening to you, but there's definitely a GPS-based user relationship table somewhere which reflects what you do to everyone they think has some connection to you and is physically close to you.
They conclude that ads are shown to people you know. So you search for a product and then your partner gets ads for this product too, as they know you spend a lot of time together through other tracking methods.
I've seen this when I am sharing IP addresses with someone because we're both connected to the same WiFi. Especially if you block meta from being able to get any data about you but the other person doesn't. Meta then get data from that other person because it can't see anything from yourself.
It helps when I'm looking for a gift for my partner - I can see what she's been looking at recently because its most of my adverts in Facebook.
- systematically record how often this happens, in contrast to other ads, and for each of the reason topics
- record for each random topic if you have also mentioned it anywhere else, e.g. in a Google search it some other digital media
- make the choice of random topics more random, ie. not depending on current moods (which might be biased through subtle, external nudges
-...
These are of course just pointers, and by no means a proper experimental setup.
I'm aware that this might take the fun out of your playful approach. However, you might be surprised by the results, in whatever direction. Also, it would give you a much more grounded fundament for further discussion. Of course you can just keep doing it the current, less tedious way. I'm only suggesting it because you seem to be interested in the topic and it might be more satisfying for yourselves to turn this into a little citizen science project.
Take this scenario for example:
1. E2E is not broken in anyway by Meta/Whatsapp. In this scenario only both WhatsApp clients (and thus you and the other person) have access to the messages. This is required for you to even read the messages in the first place.
2. The WhatsApp local instance is running on YOUR device under YOUR username / digital identity. From a legal perspective is it possible that since the app is running under your username that it is also considered "you" ?
3. If number 2 is true then it might give the local WhatsApp instance legal shield to read and do anything it wishes (locally) with the message content. And then of course this could be sent separately back to Meta/Whatsapp in a very small format easily mixed in with other traffic.
If you don't know this already, use App Warden to remove spyware handlers on Android and use RethinkDNS to block their ad domains.
If this is just in text—and I'm definitely not defending Meta here—could it also be that the ads you see have got us so figured out already? The topic you choose to talk about may be influenced or seeded by your environment (online/offline), and one thing leads to the other almost deterministically.
Here's an experiment: try rolling a die a few times or using a random number generator to pick one word or more from a list like the EFF wordlists [0], and then talk about that exclusively.
[0]: https://www.eff.org/deeplinks/2016/07/new-wordlists-random-p...
My counter argument: I use WhatsApp all the times and nothing I talk about on it ever shows up in my ads. A hefty amount of adblock may help here, as does the fact I live in the EU where the worst tracking is illegal.
Something on your phone is probably leaking data. Most suspect are third party keyboards, accessibility apps, apps with access to your photos and videos, or even Google Assistant. Third party keyboards can easily track what you're typing, accessibility apps can parse what you're saying or typing, and Google Assistant will take a screenshot of your current screen when you invoke it.
Other options are clipboard scanning (i.e. on older operating systems) and perhaps link preview services breaking out of e2e.
Finding what app is selling your information is difficult. For starts, you don't know which device is leaking. Ad companies are smart enough to see the connection between you and your wife. Her search results alone can probably make ads appear on your device!
Also consider the Baader-Meinhoff phenomenon. You can only track special topics if you track the topics of all ads and apply some statistical analysis. If you get blasted with ads all day, you'll notice the ones that you're on the lookout for. Pausing your scrolling through the app to take a screenshot will then reinforce the e-stalkers' algorithms.
If you have two old phones lying around, try repeating this trick with phones that are completely wiped, without any Google account logged in, with firewalls to block anything but WhatsApp from talking to the internet. I bet you'll find that those devices won't generate ads.
Why do I think that? For starters, enthusiasts decompile and analyse WhatsApp APK files all the time, in search for rumours and beta features to report about on tech news sites. If at some point WhatsApp added a secondary information channel about your messages (whose encryption is reasonably proven), reporters would've made a HUGE story out of it. A single line of decompiled code can send tech outlets into a frenzy of Meta accusations and let loose the EU's regulatory commissions for lying to customers. It'd be the scoop of the year!
Personally, I think "Google's keyboard or Instagram's gallery scanner is leaking my data" is a lot more likely than "WhatsApp has never been analysed enough to find the magic leaking code".
it wouldn't be surprising that whatsapp gleans info from your comms and builds a profile of you, from which ads get injected. whatsapp is not selling your actual comms, but the likelyhood you'd be interested in certain things/products. sort of like how the three names supposedly only store metadata of your calls, not the actual call.
I am to this day baffled by gullibility of people believing that WhatsApp is E2E encrypted.
Then, my friend asked me where do I want to go the most if I am to go scuba diving. I answered "Phillipines". My friend then said "Maldives is also great". We never searched for anything, just casual conversation. A few minutes later I look at my booking app, guess what were the top suggestions - Maldives, followed by Phillipines. Must be coincidence.
It would be better to use signal or element, something that tries to solve the key exchange problem. And if you are even more concerned, run their respective server software on your own hardware. Then you can inspect what goes in and out.
The fact they are spying on WhatsApp messages isn't really surprising.
It's not too hard to transcribe text into a few kb, or summarize them further on the client, and then upload that together with the rest of the data.
I'd be very interested in seeing those studies and whether they match up.
And if they have the keys then they can still read your messages!
And, given it's Meta, there's no way they are not doing this.
Pick two topics every time. Send one topic to your wife on Whatsapp. Write paper messages to your wife about the other topic and give it to her.
Track how often you see advertisements for topics in both channels. A significant difference in any one channel will be worth sharing.
Encryption is useless against advanced covert radars. Big Tech knows and benefits by lying by omission.
An easy way to test is by talking about something (but not researching it) that’ll put you “in market” for advertising targeting.
You must say things to show you’re in-market. Eg. “I really want to buy a new AWD pick up truck” — include some brand names and specifics, which will give the spy device more confidence about the ads to show you, something like “Toyota Tundras seem better than the Ford F150, but maybe I should get the Hummer EV truck”.
Try variations of this “conversation” a few times over 2 days to help give the ML model confidence. Then, monitor ads on social media, pre-roll ads and new ephemeral recommendation tiles on YT that suddenly feature videos on this topic.
[0] https://en.wikipedia.org/wiki/Targeted_advertising
There might still be collusion but it'd likely be far more transparent.
Three letters...begins with Y
So, I'm guessing that they not only read your messages, but also run TTS on your calls and serve relevant ads.
<<Your Messages.
We do not retain your messages in the ordinary course of providing our Services to you. Instead, your messages are stored on your device and not typically stored on our servers. Once your messages are delivered, they are deleted from our servers. The following scenarios describe circumstances where we may store your messages in the course of delivering them:
Undelivered Messages. If a message cannot be delivered immediately (for example, if the recipient is offline), we keep it in encrypted form on our servers for up to 30 days as we try to deliver it. If a message is still undelivered after 30 days, we delete it.
Media Forwarding. When a user forwards media within a message, we store that media temporarily in encrypted form on our servers to aid in more efficient delivery of additional forwards.
We offer end-to-end encryption for our Services. End-to-end encryption means that your messages are encrypted to protect against us and third parties from reading them. Learn more about end-to-end encryption and how businesses communicate with you on WhatsApp.>>
https://www.whatsapp.com/legal/privacy-policy/?lang=en
Assuming they are compliant with their privacy policy, they don’t even have the messages.
So, they say the protection is there once the encryption has been applied. They say nothing about what happens to the content before or after that on the end user's devices. That handling is however covered by other legitimate use clauses in the privacy statement. This covers keyword scanning for targetted ads (so a defence lawyer will say at some point.)
send messages extolling the utility of brrlftz discuss how every body not taking advantage of brrlftz will miss out. let your SO know that you need as much brrlftz as can be produced and delivered.
keep your eye out for cheap imitations offered to you.
Sadly, they just never seem to be up to the task.
Just do your own research, man...
The most notable one being renting an apartment. I viewed an apartment then sent a message to the agent requesting window grills or latches and then had adverts for that stuff straight away.
When ever I mention this on HN I get downvoted with lame excuses as to why it happened but none of them are plausible.
My friend messaged me saying he needed to go buy kitty litter and I get adverts for cat toys and supplies on Facebook despite not even replying to him?
Anyone who believes WhatsApp is really e2e is a fool IMO.
- the agent present - you both had smartphones on you - they both had bluetooth data enabled - they are both signed into location services - you'd added the agent to your contacts or they add you to theirs - you took pictures (and didn't strip the EXIF data) - you exchanged emails with the agent via gmail - you rang each other via a VOIP service - you used a map app to find the place - you found the apartment via an online listing - you found your current place some multiple of a standard contract length ago (e.g. 1 year) - your data in aggregate statistically matches that of other people's who also looked for new apartments
The metadata just drips off and they sell it, it's repackaged, bundled up and sold on to others who then target you (personally or as part of a group) in their ad campaigns.
We exchange listings to apartments to view so absolutely they know I'm looking at apartments and all that stuff.
Getting adverts for apartments is fine because I know I searched and browsed sites etc. The issue is for 2 weeks I looked at apartments. Got some adverts for AirBnb, other property sites, etc.
The particular apartment I looked at, I viewed it, walked around it, etc. Went home, pondered to myself about the apartments I had looked at that day...
Looked at facebook, etc...
Hours later I go to WhatsApp, message the agent, something like.
"I really like apartment X but the only issue is the windows don't have grills and its the 23rd floor, would you mind asking the landlord if we can get grills or window latches? If they are willing to do that then I think I'll take that apartment"
(give or take on the message as it was 3 years ago? That I took it...)
Agent had not read the message yet, infact he didn't read it until well after 6pm.
I go to facebook (2-3 minutes?) after messaging him, and I have adverts for grills and window latches.
I had not googled them, the only mention ever of grills or latches was a message on whatsapp a few minutes earlier...
---------------
Even if they are scanning the messages on the client, as far as I'm concerned it's no longer e2e. They are scanning my personal message, analysising it, building a profile on me, and using it for advertising.
Still, use Signal is the answer. The more people that use it, the more people will use it and the less we'll have to worry about whether messaging apps owned by ad companies are e2e.
That is always the one that worries me.....
Jan Koum not only had unvested shares, he was on the BOARD of Facebook and decided to quit. Along with the donation to Signal, I don't think the magnitude of this action was widely appreciated at the time.
I prefer to think that some of this is chance, or Facebook just guessing really, creepily well. But after the 3rd or 4th time we observed this same scenario play out, trust has eroded, even if I have no evidence as to how this is happening. It's time to change up the tools.
I now self-host an open source chat application, and we have not had another repeat of this kind of creepy ad invasion.
Don't. People are way too tolerant of serial liars, psychopaths and rapacious corporations.
Your agent might have searched for latches after you messaged them. Your friend likely searched for cat litter before he messaged you.
In each of those cases, E2E may still be active but you’re compromised by the human equivalent of the age old analog hole.
Jane sends unknown message to John
John fits the target profile of topic C
John gets shown ad for topic C
---
coincidentally, Janes unknown message was about topic C. Likely because she too believed it to be applicable to John.
---
Going further...
Maybe topic A was trending in Janes area. Jane's friend Sally got ads for topic A. sally talked to Jane about topic A. Jane searches topic A
A trend emerged where people who searched topic A, very soon afterwards searched topic E.
Jane is very late to hear the news about topic A. She gets shown ads for topic E. Jane thinks advertisers are reading her mind because she was thinking about topic E but didnt talk to anyone about it.
But here are a few that I could find.
https://eprint.iacr.org/2017/713.pdf
https://www.researchgate.net/publication/312778290_WhatsApp_...
https://www.ijrte.org/wp-content/uploads/papers/v8i2S3/B1093...
If you look at through the lens of game theory, the employees are extremely incentivised socially, ethically and financially to leak it.
First they didn't break E2E because its very hard to do it without people knowing.
So know we are talking about a "soft break" where they search/send key words before e2e kicks in. They wouldn't be able to that without quite a few employees knowing.
Let alone those super nerds who spend insane amount of time reverse engineering these apps and spoofing network requests just to see wassup.