.NET core should not SPY on users by default #6145

Closed
ghost opened this issue May 18, 2016 · 212 comments

@ghost

ghost commented May 18, 2016

@blackdwarf @piotrMSFT I am very disappointed to discover that .NET core comes with a hidden and enabled spy utility that reports on its users (Lakshanf/issue2066/telemetry dotnet/cli#2145). Apparently, MS has learned nothing from the backlash against Windows 10 spying on users. I suspect many will not want to install .NET core for this reason, which is a shame because .NET core is otherwise cool.

@richlander
Member

Our recent blog post discusses the addition of telemetry to the .NET Core tools. See: https://blogs.msdn.microsoft.com/dotnet/2016/05/16/announcing-net-core-rc2/#telemetry

The folks on my team and I are motivated to provide a great product. As you can also see from the blog post, we've made some pretty dramatic changes in RC2. We believe that they are the right ones, but we need both feedback and usage data in order to help us find all of the rough edges. Usage data tends to be more objective in the aggregate and user feedback more insightful, so we do a better job when we have both available.

The data we collect does not identify individual users. We're only interested in aggregate data that we can use to identify trends. The telemetry feature is configurable, so you can turn it on/off at any time. It is also scoped, only applying to tools usage, not the rest of the product. We think that this is a good trade-off and recognize that not everyone will like it. We do know, however, that many people will like the product improvements that will come from this insight.

We intend to share the data. The presence of it will do a lot to define the scope of data. It will also give the community access to the same insight we have. We very much feel that improving .NET Core is a shared need and task. As an example, we would welcome a PR from the community that added another telemetry data point given a strong improvement reason and no loss in anonymity.

We are separately considering opt-in runtime telemetry to learn more about crashes, GC pauses and startup time. There is no way we can get enough insight about the product without that kind of information. We are very focussed on constant improvement and will transparently do what it takes to ensure the product is compelling and competitive.

As an aside, it's been a busy week with shipping RC2 and answering questions. I haven't actually looked at this data yet and I'm one of the primary consumers. I'll be doing that today or tomorrow. I'm looking forward to sharing my insights.

@guardrex
Contributor

@richlander

we need both feedback and usage data

Does the telemetry still include arguments provided to the dotnet command? In server hosting scenarios, some may have sensitive arguments passed to the command (for portable apps) that they wouldn't want leaked to MS.

and will transparently do what it takes

The program is not "transparent" IMO.

@terrajobst
Member

Does the telemetry still include arguments provided to the dotnet command?

My understanding is that this was recently discussed with our privacy team and we concluded that collecting the arguments themselves (hashed or not) is not acceptable per our privacy policies. Not sure whether the code already reflects that, but it's being worked on.

@terrajobst
Member

terrajobst commented May 18, 2016

The program is not "transparent" IMO.

What would you accept as sufficiently transparent? Not trying to say that we already are sufficiently transparent; I'm trying to understand your concern and what we could do to make it better. The product is worked on by various teams who all contribute to the same open source code base on GitHub. Clearly you think that's not sufficient, so I'd like to understand what process would address that.

@guardrex
Contributor

@terrajobst Ah! Thanks. I'm glad the arguments are safe on the server.

WRT transparency: There is no indication at the time of install that the dotnet cli is automatically opted-into data sharing. There's no checkbox that will set the opt-out env var. There's no note or link to the GH issue or a Docs page that describes the program and how to opt-out. The privacy policy merely links to the generic MS privacy policy, where there is no mention of the program.

You really have to have heard about this through the GH issues or via chat at JabbR or Slack ... or Wireshark your server I guess. In my mind, that hardly constitutes "transparency."

IMO there is a great risk of negative PR here if the mainstream media gets hold of this issue. It's only a matter of time before some enterprising journalist looking for a scoop picks up on this. The headlines will not be good: "Microsoft caught with sneaky program to spy on companies" ... I know ... I know ... barely accurate given what the data is, how it's shared, and its use by the teams. You know that doesn't matter one bit when you're trying to sell a newspaper. I was a college newspaper editor. Trust me ... it will not be good if the current disclosures about this program hold to RTM.

@richlander
Member

@guardrex This is good feedback. We do have a bit more to do to make sure that everything to do with telemetry is obvious. We'll make sure that gets into the next release.

@ghost

ghost commented May 19, 2016

GuardRex is exactly right about the lack of transparency and danger you are in for a shitstorm, so it is a good idea to include a checkbox in the installer to make it visible!

Also, you should keep in mind the problem is both privacy AND security. As for security, I think that MS forgets that a power user/developer may have hundreds of pieces of software installed. If all these pieces of software (in a stealthy way) report usage back to various servers on the internet, then the security attack surface becomes so large that it is impossible to secure the computer. Hence, many companies will ban your software (especially on Linux servers).

This particular "feature" may undo all the good things that MS is doing with .NET core. Even if I might personally be persuaded to risk my computer and privacy, some of my customers won't. Hence, I will be reluctant to base my development on .NET core because of the customer reaction to spying.

@vcsjones
Member

I would probably agree that people will be a little miffed by this. Homebrew for OS X recently went through this even though they were well intentioned, did it anonymously, and provided a way to opt out.

I think simply asking people on first use if they'd like to submit telemetry is a good start.

Consider what Yeoman does on first use:

[Screenshot: Yeoman's first-use prompt, captured 2016-05-19]

I think people are generally happy to give feedback when asked.
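
The ask-on-first-use approach suggested above could be sketched roughly as follows. This is only an illustration, not how Yeoman or the dotnet CLI actually implement it: the function name `resolve_consent`, the JSON file layout and the prompt wording are all hypothetical. The point it shows is that the question is asked once, the answer is persisted, and anything other than an explicit "yes" counts as a refusal.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def resolve_consent(config_path: Path, ask: Callable[[str], str]) -> bool:
    """Return the stored telemetry choice, asking the user only on first use."""
    if config_path.exists():
        # A previous run already recorded the answer; never ask again.
        return bool(json.loads(config_path.read_text())["telemetry"])

    answer = ask("Send anonymous usage statistics? [y/N] ").strip().lower()
    # Anything other than an explicit yes is treated as a refusal (opt-in).
    consent = answer in ("y", "yes")
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps({"telemetry": consent}))
    return consent


if __name__ == "__main__":
    # Demo with a canned answer (the user just presses Enter).
    demo_path = Path(tempfile.mkdtemp()) / "consent.json"
    print(resolve_consent(demo_path, lambda prompt: ""))
```

In a real tool the `ask` argument would be the built-in `input`; it is passed in here so the logic can be exercised without a terminal.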

@mmc41

mmc41 commented Jun 10, 2016

@blackdwarf
@piotrpMSFT
@richlander
In related news, VS2015 just got into big trouble because spy code was discovered:
https://www.reddit.com/r/cpp/comments/4ibauu/visual_studio_adding_telemetry_function_calls_to/d30dmvu

You should consider learning from such mistakes!

@vcsjones
Member

Looks like issue dotnet/cli#3404 is tracking implementing notification of telemetry.

@h3smith

h3smith commented Jun 28, 2016

@richlander - as someone looking to deploy projects built with this in healthcare and classified environments, this creates significant challenges. An environment variable is a decent starting point, but build-time and local options should also be provided to ensure that this data is not collected. I appreciate what you guys are trying to achieve, but it introduces security concerns.

@akoeplinger
Member

@guardrex https://blogs.msdn.microsoft.com/dotnet/2016/06/27/announcing-net-core-1-0/#user-content-net-core-tools-telemetry shows the data points that are collected and the following statement which should make it pretty clear that telemetry only applies to the tools/CLI (i.e. dotnet):

The feature will not collect any personal data, such as usernames or emails. It will not scan your code and not extract any project-level data that can be considered sensitive, such as name, repo or author (if you set those in your project.json). We want to know how the tools are used, not what you are using the tools to build. If you find sensitive data being collected, that’s a bug. Please file an issue and it will be fixed.

@guardrex
Contributor

only applies to the tools/CLI (i.e. dotnet)

If you mean it only applies to executing a portable app using dotnet (dotnet .\myapp.dll) and not a self-contained app using corehost (myapp.exe) ... I don't think the language states that clearly. One has to know that you don't consider corehost to be a "tool," and that's not an assumption that I would make.

There is an on-going problem in assuming too much prior knowledge in communication with people (outside of the ASP.NET docs, where a major effort has been made to address this problem). I think writing docs with greater attention to explicit and comprehensive explanations, as annoying and time-consuming as that may be, clears up a great deal of confusion.

Setting this minor confusion aside, I greatly appreciate the effort that has been made to inform everyone about the telemetry program. I still wish that production servers weren't automatically opted-into the program, mostly because (just like @blackdwarf commented in a recent video interview I saw) I hate having to set and maintain env vars on servers ... a total PITA IMO.

@dlebedynskyi

dlebedynskyi commented Jun 28, 2016

Guys, this issue is really important. A lot of projects ask for telemetry, and that is OK. In fact, for a bunch of those, like yo, bower and so on, devs like me willingly opt in.
But not asking the user whether they even want it, and pointing to some EULA that no one will really read, smells. It is horrible negative PR.
Add an option to opt in. Explain in detail what you are going to collect and what you are not. Do not do it by default.
Otherwise we really will have to block the feature or not use dotnet at all. I really don't think that a paranoid security team will even allow devs to deploy this now.

@kspeakman

Discovering this telemetry has put the plans I had in using .NET Core back on the drawing board. You are essentially refusing to accept an arms-length relationship by including telemetry. Data leakage is a risk even if it isn't user specific. It also creates attack opportunities since attackers now have this plentiful and predictable avenue of communication to go after. Not to mention that once marketing gets wind (if they didn't help drive it in the first place), the data collection will be expanded. Save yourself work by creating more admin/security work for your users (to opt out or block telemetry). Just because it's an industry trend doesn't mean it's a good thing to do. </3

@guardrex
Contributor

guardrex commented Jul 1, 2016

@kspeakman On the bright side, it is well controlled by the env var ...

https://github.com/dotnet/cli/blob/rel/1.0.0/src/dotnet/Telemetry.cs#L39-L44

... so at least if you add that via web.config, PowerShell, manually, or whatever ... it disables telemetry effectively. However, if you were more generally concerned about Microsoft.ApplicationInsights being on the server at all, then they have said that corehost doesn't have telemetry built-in, so you could go the self-contained app direction (no shared framework on the server) and avoid this entire issue. The only catch is that you need to pull the ASP.NET Core Module out and install that manually ... they don't have a standalone installer for the module yet (AFAIK), nor has it been spun off into OSS yet (but they are planning to do that).
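
As a minimal sketch of the env var opt-out described above: the comment does not name the variable, but the .NET CLI documentation gives it as `DOTNET_CLI_TELEMETRY_OPTOUT`. The helper names `env_with_opt_out` and `run_dotnet` are hypothetical, and this only covers a process you launch yourself; a machine-wide setting still has to be made through the OS or the hosting configuration.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence

# Variable name taken from the .NET CLI documentation, not from this thread.
OPT_OUT_VAR = "DOTNET_CLI_TELEMETRY_OPTOUT"


def env_with_opt_out(base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy an environment and set the CLI telemetry opt-out flag in the copy."""
    env = dict(os.environ if base is None else base)
    env[OPT_OUT_VAR] = "1"
    return env


def run_dotnet(args: Sequence[str]) -> subprocess.CompletedProcess:
    """Run `dotnet <args>` with the opt-out flag set for that child process only."""
    return subprocess.run(["dotnet", *args], env=env_with_opt_out(), check=False)
```

For example, `run_dotnet(["build"])` would start a build with the flag set without touching the parent environment.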

@RomanShumikhin

Is there any chance that this telemetry "feature" will be removed from the next version of the tools?
If not, I totally agree with the original poster, this should be opt-in, not opt-out.

@mschlechter

At the very least, the dotnet program should ask on first run whether the user wants this or not.

First Windows 10 and now this.

I don't want telemetry. At all. It's fine when people are beta testing a product in a special testing environment, but not in production.

@linkdata

Making this opt-out instead of opt-in seems like really poor judgement. I understand and respect the need for you to collect some usage to help guide the .NET Core platform, but printing a few lines of text once before starting to send unspecified data over the 'net to some server is just disrespectful.

Please make it opt-in or remove it entirely.

@ghost

ghost commented Sep 23, 2016

I vote to remove it fully from the .NET Core source code.
It must be an external option: the user should be able to download a separate package to start statistics collection.

@ghost

ghost commented Nov 26, 2016

Since there hasn't been any post on this topic in a couple of months, I will share some insights having just come across this as a fresh (potential) adopter of Core CLR.

Just downloaded latest build of .NET core and just by luck noticed the unremarkable disclaimer after running one of the dotnet shell commands.

This is altogether ridiculous and comes on the heels of already-ridiculous telemetry collection in their other products. I feel like Microsoft is saying publicly that they're not tone deaf to the community, and then they keep doing things like this.

I was starting to get excited about the implications of Core CLR and what that could mean for the expansion of C# (the language itself is really fantastic).

This automatic telemetry nonsense is a big reason why I shy away from the Windows platform entirely. It's not even about my own personal feelings or beliefs on privacy concerns and whatnot. It's about selling this platform to my company and my contracts. In an enterprise environment, getting people to trust Microsoft is already an uphill battle with many of my fellow developers and higher-ups. Making the case for using C# + Core CLR on Linux is MUCH easier than making the case for switching entirely to Windows.

However, this telemetry nonsense is simply a nonstarter. Imagine trying to sell this to someone already averse to monoliths and vendor lock-in (synonymous in our field with MS, for better or worse) then immediately having to defend telemetry collections (and the disablement thereof). We run production workloads in production data centers with enough infosec headaches already. Things like this are simply nonstarters for many executives. Sure, we can add an environment flag, but when has someone EVER forgotten to do that?

Alas, I am beginning to feel that Core CLR will go the way of Windows 10: admittedly great technology crippled by corporate nonsense that makes many developers just go look for some alternative when choosing a tech stack that doesn't come laden with such nonsense.

TL;DR; turn this crap off. You're pissing off the people you claim to be building tools for.

@CodesInChaos

How about checking HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\DataCollection\AllowTelemetry and disabling telemetry when it exists and is 0 in addition to your product specific opt-out? Users who don't want Windows telemetry almost certainly don't want .NET telemetry either.

@ghost

ghost commented Mar 16, 2017

@CodesInChaos, .NET Core can run on Linux...
And I personally think a flag hidden away somewhere is a bad idea... as is making any assumptions...
The user should be able to control everything in an obvious way.

@vcsjones
Member

vcsjones commented Mar 16, 2017

@HardHub

in addition to your product specific opt-out

CodesInChaos is suggesting that if the platform is Windows and they opted out of Windows telemetry, then just assume "no" telemetry. Otherwise, allow the user to make a selection.
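
The suggestion could be sketched like this. It is a rough illustration under stated assumptions, not the CLI's behaviour: the function names are hypothetical, the opt-out variable name `DOTNET_CLI_TELEMETRY_OPTOUT` comes from the .NET CLI documentation rather than this thread, and only the registry path and the "exists and is 0" rule are taken from the proposal above.

```python
from typing import Mapping, Optional


def telemetry_enabled(
    platform: str,
    env: Mapping[str, str],
    windows_allow_telemetry: Optional[int],
) -> bool:
    """Decide whether telemetry may be sent, honouring both opt-out signals.

    `windows_allow_telemetry` is the AllowTelemetry policy value, or None
    when the value is absent or the platform is not Windows.
    """
    # The product-specific opt-out always wins, on every platform.
    if env.get("DOTNET_CLI_TELEMETRY_OPTOUT", "").lower() in ("1", "true"):
        return False
    # On Windows, a policy value of 0 is treated as an opt-out as well.
    if platform == "win32" and windows_allow_telemetry == 0:
        return False
    return True


def read_allow_telemetry() -> Optional[int]:
    """Read the policy value on Windows; return None elsewhere or if unset."""
    try:
        import winreg  # standard library, but only available on Windows
    except ImportError:
        return None
    key_path = r"SOFTWARE\Policies\Microsoft\Windows\DataCollection"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "AllowTelemetry")
            return int(value)
    except (OSError, ValueError, TypeError):
        return None
```

A caller would combine the two as `telemetry_enabled(sys.platform, os.environ, read_allow_telemetry())`. On Linux and macOS the registry check is simply skipped, so only the product-specific opt-out applies there.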

@ghost

ghost commented Mar 19, 2017

@vcsjones

I see... But, as I mentioned, what should Linux users do? And users who are not on Windows 10 Enterprise?
I think a registry key somewhere is not a good idea for user privacy... it is not obvious enough.
It should be very easy to find and disabled by default... I personally suggested shipping telemetry as a package. Google, for example, does not force us to use GA....

@OpinionatedGeek

In the interests of transparency, please let us all know which hostnames/servers to which the data is sent.

This will also allow people to block this traffic once, at the network level, instead of having to update every RC file for every shell for every user for every machine.

@vcsjones
Member

@OpinionatedGeek

In the interests of transparency, please let us all know which hostnames/servers to which the data is sent.

Telemetry is collected using Application Insights, to my knowledge. The documentation for their endpoints and IPs is here: https://docs.microsoft.com/en-us/azure/application-insights/app-insights-ip-addresses

@OpinionatedGeek

@vcsjones Many thanks for that link and those hostnames. It's a very handy reference!

Can anyone from Microsoft confirm or deny that this is the full, correct list? I note that blocking all the listed hostnames would mean blocking access to hosts like login.windows.net and packages.nuget.org - hosts Microsoft probably doesn't want blocked.

Many thanks.

@vcsjones
Member

vcsjones commented Mar 20, 2017

The ones specifically for telemetry are dc.services.visualstudio.com and dc.applicationinsights.microsoft.com. The rest are for ApplicationInsights, but aren't categorized as Telemetry.

Keep in mind this would affect any application that uses Application Insights, not just the SDK.
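
For anyone who wants to block at the network level anyway, a minimal sketch that generates hosts-file entries for the two hostnames named above. The function name and the `0.0.0.0` sink address are illustrative choices; the host list is only what is stated in this comment and has not been confirmed as complete, and, as noted, it would also cut off any other application that reports to Application Insights.

```python
# Hostnames as listed in the comment above; not an official or complete list.
TELEMETRY_HOSTS = (
    "dc.services.visualstudio.com",
    "dc.applicationinsights.microsoft.com",
)


def hosts_file_entries(hosts=TELEMETRY_HOSTS, sink="0.0.0.0"):
    """Build hosts-file lines that point each hostname at a sink address."""
    return [f"{sink} {host}" for host in hosts]


if __name__ == "__main__":
    # Print the lines so they can be reviewed before being added to a hosts file.
    print("\n".join(hosts_file_entries()))
```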

@OpinionatedGeek

I'd still like to hear from Microsoft the official list of servers to which the telemetry is sent. Can someone from Microsoft please reply with these details?

The telemetry hostnames aren't in the dotnet CLI code base as far as I can tell, and I don't want to just assume that the hosts are the same as the 'Azure Application Insights' hosts, or that they remain the same across all OSs.

@dasMulli
Contributor

dasMulli commented May 29, 2018

About GDPR: it would only be a violation if it collected personal data or data that allows an individual to be identified.
AFAIK App Insights hasn't persisted IP addresses since February, so I guess this should be okay-ish (assuming no other data would be a violation).
edit: there's also no session data. Not sure about machine info.
edit 2: apparently a hashed MAC address is sent.

@mtimofiiv

mtimofiiv commented May 29, 2018

I don't use .NET (I came here from Hackernews). But just so you know, this is the kind of thing that gives this software and your brand in general a horrible reputation. I for example use VS Code as my editor of choice and when I first read about this telemetry stuff a while back, I went through the source to check for this anti-feature being built into that too. I will not install Windows on principle because I know it's loaded with your pushy telemetry as well. Apple clearly asks you ONCE if you want to report anonymous usage statistics to them, and it's one checkbox to deal with in the setup of the machine.

It would not take a lot of effort on your part to make the choice opt-in rather than opt-out by default, and no argument that project contributors or anyone else has made in this thread or others like it (there are a couple of issues on here, if I remember correctly, about telemetry) has clearly shown why doing so would somehow compromise your product feedback.

So listen to your users who are clearly speaking out against this telemetry (or rather the way you choose to context the collection of it). All you have to do is make it compliant to their expectations, because that's what building great user experiences is all about. I don't get why this is so hard to understand.

These telemetry GH issues demonstrate that people are passionate about your product and about their own privacy. Do the decent thing and stop jamming your telemetry down people's throats.

@Zenexer

Zenexer commented May 29, 2018

Opt-out is still unacceptable. None of the (potential) users complaining about this are going to be satisfied unless it’s opt-in. I love .NET, and even I’m hesitant to use Core if I have to do a dance to avoid telemetry, as it presents an obvious legal and PR risk for me.

@ibakirov

I agree, there should be an option to opt out of this feature.

@roebuk

roebuk commented May 29, 2018

It's not just dotnet/cli that's guilty of this, Office-js is also collecting user data without explicit consent. OfficeDev/office-js#61

@kiicia

kiicia commented May 29, 2018

@dasMulli practically any datum is considered personal data; it does not need to be an IP address. Even the user-agent is identifying (especially combined with other data) - this is the exact strategy that spammers use to keep a persistent fingerprint of you
and that's why it is opt-in - because someone must check what data is collected and what it is used for, then write it all down and ask the user for permission
it is the exact opposite of the current strategy of collecting as much data as possible for current and future use

@tadas-subonis

@kiicia No. You have to be able to identify an individual person from the given data. Identifying a person is not the same as being able to tell "it's the same customer". Using an IP, you can identify a person by asking the ISP whom that IP belongs to. You can't do that with a User Agent or a cookie.

@creshal

creshal commented May 29, 2018

You're still violating the "privacy by default" principle of the GDPR, as some of these data could be PII under some circumstances (e.g. exotic UA combinations). Opt-in would solve a lot of potential legal headaches.

@chrisjsmith

Something to back this complaint up:

User's hashed MAC address is sent by default. This is consistently hashed so can be correlated across other information sources to identify a user.

https://github.com/dotnet/cli/blob/b45f1fb439b36872c249b07f1c6bddf20167f166/src/dotnet/Telemetry/Sha256Hasher.cs#L12

This is seriously not on.
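
To illustrate why a consistently hashed MAC address is still an identifier, here is a small sketch. It assumes an unsalted SHA-256 over a normalised MAC string; the exact input format the CLI hashes is not stated in this thread, and the function names and example address are hypothetical. The same input always yields the same digest, and because the device part under a known vendor prefix is only 24 bits, candidates can be enumerated until the digest matches.

```python
import hashlib


def hash_mac(mac: str) -> str:
    """Unsalted SHA-256 of a normalised MAC address string (assumed format)."""
    normalised = mac.replace(":", "").replace("-", "").upper()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def recover_mac(target_hash: str, oui: str, limit: int = 1 << 24):
    """Enumerate device IDs under a 6-hex-digit vendor prefix until the hash matches."""
    for device in range(limit):
        candidate = f"{oui}{device:06X}"
        if hash_mac(candidate) == target_hash:
            return candidate
    return None


if __name__ == "__main__":
    observed = hash_mac("00:1A:2B:00:01:F4")
    # The same device produces the same value in every report.
    print(observed == hash_mac("00-1a-2b-00-01-f4"))
    # A small search space is used here so the demo finishes instantly.
    print(recover_mac(observed, "001A2B", limit=1000))
```

A per-install random identifier, or a salted hash with a locally stored salt, would avoid tying the value to the hardware.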

@kiicia

kiicia commented May 29, 2018

@tadas-subonis it depends on who you ask. There are internal corporate trainings/policies (consulted with/interpreted by lawyers) which clearly state what is considered personal data and what is not, and it seems that literally any datum "produced" by the client is automatically user data, no matter how trivial it may seem
also, some companies downplay/ignore certain details because it is their modus operandi to gather and use all available data
there is, however, a clause about "data which are necessary to use/work with/deliver service" - for example, you may require certain data because you are unable to deliver the service otherwise... then consent can be implied by the decision to use the service alone - it is also used to explain why a certain amount of data must be gathered; still, you need to say something to the user/client and give them a chance to express their decision

@bhartvigsen

I have to admire Microsoft's consistent use of the word "telemetry" across all its various privacy-violating platforms and practices.

@WenJianhub

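Microsoft is arguably the single dominant player in operating systems today, so it may sometimes overlook issues that users care about. My view on .NET Core collecting user information is that this should no longer be a matter of much debate: users are clearly unhappy with the way Microsoft collects it. Shouldn't the most direct and reasonable course be for Microsoft to give users a friendly notice and a choice before collecting their information, tell them the scope of the collection, and give them full control over the decision? The possible outcomes are simple: 1. Microsoft adopts users' reasonable suggestions. 2. Microsoft does not adopt them. Either way, Microsoft first needs to take a clear position on what users are proposing!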

@dandago

dandago commented Jul 24, 2018

Microsoft doesn't give a toss what the community wants. Open source for them is just a way to get code improvements and documentation into their systems at other people's expense.

@Meyhem

Meyhem commented Sep 21, 2018

Why does the opt-out have to be so hard? Setting one long env. variable on every process? It seems Microsoft is making quite an effort to provide UX that obscures anything it doesn't want you to see.
Wouldn't it be nicer to get a prompt during install, not just an info message about a crime already committed?
Or at least telemetry-less packages?

@svick
Contributor

svick commented Sep 21, 2018

@Meyhem

Setting one long env. variable on every process ?

Every OS I know lets you set the environment variable once in a way that applies to every process.

@chrisjsmith

This isn't a technical discussion. We can set environment variables.

We are asking for opt-in with explicit consent, not opt-out.

This is a sign of respect for your userbase, something which we demand after years of abuse.

@chrisjsmith

There's a fine example of where opt out telemetry goes to crap...

@dasMulli
Contributor

dasMulli commented Dec 3, 2018

@kstarikov I think you could/should create a new issue for this

@dgl

dgl commented Aug 23, 2020

I sent a security report to Microsoft detailing how the collected data is not anonymous, but they didn't consider it a security issue. The write-up is now publicly available here: https://dgl.cx/2020/08/ms-report.txt

Given the fact it is easy to accidentally send data to Microsoft I have written up how to send a right to object request here: https://dgl.cx/2020/08/dotnet-sdk-gdpr

@Kein

Kein commented Jan 7, 2021

Why would it be anonymous? Making it anonymous invalidates its value. User data is, essentially, money, and if you can't associate it with an individual - it is a waste of money. They will never make it anonymous, it simply makes no commercial sense.

@Bert2Go

Bert2Go commented Aug 31, 2022

Linux's philosophy is open source and privacy. If you are a developer and create a product for Linux, you adhere to that philosophy. Microsoft, you are a guest in the house of Linux and the first thing you do is break this very rule. Shame on you, Microsoft. I am OK with collecting data if the user makes the decision and agrees to it! You have your own operating system where you can abuse your users any which way you see fit. I made the conscious decision to leave Windows for that reason.
Technically, if you enroll users by default, you should compensate them for their time and for the information you use to make your own product better.

@nickpreston24

Microsoft doesn't give a toss what the community wants. Open source for them is just a way to get code improvements and documentation into their systems at other people's expense.

You're absolutely right. So, let's keep dragging Microsoft kicking and screaming into Linux, please, so we can all get a break from their abuse. If they want to earn our hard-earned cash, they should learn to innovate and abandon 90's Microsoft's "divide and conquer" strategies or anything related to it, this telemetry issue included. There's already plenty of tomfoolery they are doing, as it is, so complaining about telemetry is not at all unwarranted. In fact, the video I linked brought me to this discussion.

If Microsoft does the right thing and shifts their entire OS over to having, say, an Ubuntu kernel, then and only then will I consider buying Office 365 (KDE Edition, lol).
