Sunday, October 16, 2016

EU GDPR will be there but how to start the journey - Chapter I

How to understand what is happening in our network

Understanding and monitoring network traffic as such might seem to be enough, but it is not. You can use network monitoring tools from SolarWinds, IBM, HP and others, but how well do they support and utilize machine learning to understand user behavior, instead of only protocols, packet sizes and source and target addresses at the Ethernet and IP level, not forgetting OSI layer 4 (TCP or UDP) and the application layers?

So having and saving logs will not be enough in the future. Logs are an important part of the process, but there should also be more proactive, real-time alerting based on abnormal user behavior.

Microsoft Advanced Threat Analytics video

To understand what abnormal behavior is, machine learning is one key way to find and identify changes, and it works in multiple ways. Most security breaches still, unfortunately, start from poor user credentials and user behavior, where the attacker has somehow got the user name and password - and guessing the user name is not hard work. Does the user have an email address? Yes. Does the user have a SIP address? Yes. The conclusion might then be that SMTP address = SIP address = User Principal Name - not hard, or what?

When the attacker has the user name and password and can log in remotely to the user's PC, the world is open to him or her. The hacked account is the key to start browsing the corporate network and finding workstations and servers/services with known vulnerabilities and no installed fix. This way the attacker learns more and more about your network and, unfortunately, your own IT knows nothing.

Microsoft Advanced Threat Analytics is one tool for this: it learns how users behave and then alerts on abnormal behavior. For example, Jack works in the London office from 8 to 17, and suddenly one day there are normal logins at that time but also logins at different times from different PCs and locations. So with machine learning you can start to find these patterns, and hopefully when an alert comes you also have control actions and governance for how to proceed.
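The Jack example can be sketched in a few lines of Python. This is only a minimal illustration of building a baseline of a user's usual login hours, machines and locations and flagging logins that fall outside it. It is not how Microsoft ATA works internally, and all names (`LoginEvent`, `build_baseline`, `is_abnormal`) and the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginEvent:
    user: str
    hour: int       # hour of day, 0-23
    machine: str
    location: str


def build_baseline(history):
    """Collect the hours, machines and locations seen for each user."""
    baseline = defaultdict(lambda: {"hours": set(), "machines": set(), "locations": set()})
    for event in history:
        profile = baseline[event.user]
        profile["hours"].add(event.hour)
        profile["machines"].add(event.machine)
        profile["locations"].add(event.location)
    return baseline


def is_abnormal(event, baseline):
    """Return the reasons why a login deviates from the user's baseline."""
    profile = baseline.get(event.user)
    if profile is None:
        return ["unknown user"]
    reasons = []
    if event.hour not in profile["hours"]:
        reasons.append("unusual hour")
    if event.machine not in profile["machines"]:
        reasons.append("unusual machine")
    if event.location not in profile["locations"]:
        reasons.append("unusual location")
    return reasons


# Hypothetical history: Jack logs in from his London desktop between 8 and 17.
history = [LoginEvent("jack", hour, "LON-PC-01", "London") for hour in range(8, 18)]
baseline = build_baseline(history)

normal = LoginEvent("jack", 9, "LON-PC-01", "London")
suspicious = LoginEvent("jack", 3, "HEL-PC-77", "Helsinki")
print(is_abnormal(normal, baseline))
print(is_abnormal(suspicious, baseline))
```

A real product would learn statistical patterns rather than exact sets, but the principle of comparing each login against learned behavior is the same.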


The next article will discuss Azure AD Premium and how it supports the GDPR journey.


Use the following link to watch the ATA video by Microsoft

Microsoft ATA video

"All comments, thoughts and picture's are my own"

Fast and Rusty Beetle - Extreme CarShow Helsinki 2010


GDPR - The Law Part 0

General Data Protection Regulation in a nutshell, from Wikipedia




  1. Applies to registrars (controllers) and processors based in the EU, but also to global organizations if they process personal data of EU residents.
  2. Each member country must have a supervisory authority to work with organizations, the EU and residents.
  3. Data Protection Officer (DPO), a new role required for companies with over 250 employees. Under the GDPR, the independent Data Protection Officer (DPO) will be under a legal obligation to notify the Supervisory Authority without undue delay, and this is also still subject to negotiations at present.
  4. Data breaches and notifications
    1. Within 72 hours, or as soon as possible if the risk is high
  5. Sanctions
    1. 2% of global turnover or 10,000,000 euros, whichever is greater
    2. 4% of global turnover or 20,000,000 euros, whichever is greater
  6. The law explains the concept of personal data, but as described in an earlier or later chapter it can be quite blurry, with extensions based on age, religion, color and so on.
  7. Rights
    1. Right to be forgotten
    2. Data portability - should this mean that my data in Instagram should be movable by drag and drop to Twitter or Facebook, or vice versa? Maybe not :-)
    3. My data - the right to see what personal data is stored about me
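The "whichever is greater" rule in the sanctions item is easy to misread, so here is a small worked sketch. The function names and the example turnover figures are hypothetical; the percentages and fixed amounts are the ones listed above, and the result is the upper limit of the sanction, not a prediction of an actual fine.

```python
def max_fine_eur(global_turnover_eur, percent, fixed_amount_eur):
    """Upper limit of the sanction: the greater of a share of turnover or a fixed amount."""
    return max(global_turnover_eur * percent // 100, fixed_amount_eur)


def lower_tier(global_turnover_eur):
    return max_fine_eur(global_turnover_eur, 2, 10_000_000)


def upper_tier(global_turnover_eur):
    return max_fine_eur(global_turnover_eur, 4, 20_000_000)


# Hypothetical large company: 2 billion euros global turnover.
print(upper_tier(2_000_000_000))  # 80000000, the 4% share is greater
# Hypothetical smaller company: 100 million euros global turnover.
print(upper_tier(100_000_000))    # 20000000, the fixed amount is greater
```

So for a smaller organization the fixed amount is the one that matters, and for a large one the percentage.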

To Be Continued...

"All thoughts and pictures are my own and I don't have any legal background"

Extreme Car Show 2010


Saturday, October 15, 2016

GDPR Part IV - Align with your digital journey

So,

we have storage, we have data and we have the users, but how do GDPR and digital link together?

Easily - the answer is quite simple in a pragmatic way. In digitalization, users will have tools based on their role in the organization, as explained earlier. A sales person uses CRM, an R&D person uses PLM, and so on.

So the future system should support a couple of things from a wide angle (I believe there is much more...).

1. Automate the user process: make the work easier with tools which support the work and not the system. What automation does is that something is always done the same way, and it can be monitored and explained what has happened. Workflow is a great example. When the user tags a file as a draft or final version in the metadata, there can be two different workflows: a rule which automatically deletes the draft file after 6 months, and a second rule which requires two-way approval to move the document to a secure data repository or asset place. The idea is that the document is searchable and available for others, which actually explains a couple of key things.
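The two workflow rules can be written down as a small decision function. This is a sketch under assumptions: "6 months" is approximated as 183 days, "two-way approval" is read as two distinct approvers, and the names (`Document`, `next_action`) and action labels are hypothetical, not taken from any product.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

DRAFT_RETENTION = timedelta(days=183)  # assumption: "6 months" as 183 days
REQUIRED_APPROVALS = 2                 # assumption: two distinct approvers


@dataclass
class Document:
    name: str
    status: str          # metadata tag set by the user: "draft" or "final"
    tagged_on: date
    approvers: set = field(default_factory=set)


def next_action(doc, today):
    """Decide what the workflow should do with a document based on its metadata."""
    if doc.status == "draft":
        # Rule 1: drafts are deleted automatically after the retention period.
        if today - doc.tagged_on > DRAFT_RETENTION:
            return "delete"
        return "keep"
    if doc.status == "final":
        # Rule 2: final versions move to the secure repository only after approval.
        if len(doc.approvers) >= REQUIRED_APPROVALS:
            return "move_to_repository"
        return "await_approval"
    # Anything without a recognized tag cannot be handled automatically.
    return "needs_classification"


today = date(2016, 10, 15)
old_draft = Document("offer-v0.3.docx", "draft", date(2016, 1, 10))
approved = Document("offer-final.docx", "final", date(2016, 10, 1), {"sales", "legal"})
print(next_action(old_draft, today))
print(next_action(approved, today))
```

Because every decision comes from the same function, each outcome can be logged and explained afterwards, which is the point of automating the rule.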

2. Metadata and classification. Legacy storage and file servers do not support these at the sophisticated level required, and usually a file server is just a file server: lots of data, a weird folder structure, maybe some security-related issues, data in the wrong place and so on. You name it.

In 201x, what is the demand for file servers in central data centers or branch offices while everyone is talking about cloud and access from anywhere? And comparing an on-premise solution to a certified cloud solution or service, a good question is whether we are compliant and how our data center services will be analyzed against laws and regulations. Are we able to show a reliable audit trail of what has happened, who has done what, when it happened and, unfortunately, how long ago it actually happened?

3. Workflow - like metadata and classification, one valuable thing is that we automate simple things with workflows or software robots, which makes this interesting. When we try to define the workflow we move more towards process and how people work, which is awesome.

How we work can only be understood by doing the work or by interviewing the people who know it. But there is also another side: resistance to change and staying in the as-is. That is where there is a need for people who understand the as-is through interviews, but who use their knowledge of emerging technologies and digitalization to bring value and ideas for how this manual work can be automated, what kind of value it creates and how the benefits can be achieved and realized - without talking about products at all.

4. Process - as described earlier, one key thing is to know how we work and also to see how we should and would work and behave in the future.


5. Audit - this is a nice Pandora's box which has multiple layers and views, and here we open just a small part of it, in no particular order.

5.1 User identity, starting from the process of how the users who manage personal data are created, and covering everything from the identity point of view: create, modify, delete, communicate... Not to forget privileged accounts and group membership management.

5.2 Authentication / login and the audit trail, or detect and control. Regardless of whether the data is on premise or in the cloud, one question is how you can detect abnormal behavior. From the user profile or personalization view we know that Jack works from 7 to 16, sitting in the office with a desktop on the table. This is the place where he sits and tries to sell products to customers in the CRM solution, with personal data included. What if Jack logs in from the other side of the world, or Jasmine tries to log on from Jack's PC at 03:45?

What kind of governance, process, logging, communication and action capabilities does the organization have?

A brilliant example is the Finnish Police, who were in the news after they had internally detected police officers reading data about the late skier Mika Myllylä years ago. Those officers were not involved in that case in any way and did not even work or live in the same municipality (I am not fully sure of this), so from their daily work there was no need to look at the case information. They were in the news when the court made its decisions, with quite light penalties. The news did not mention how this illegal data breach was detected, but the key here is that there was a control in place. We can always ask whether those persons can work as police any more, because they might also look at data about their neighbors and misuse the power given by their work.

As said, from the audit trail point of view the organization should have good logging systems with proactive capabilities, and governance for how to proceed when abnormal behavior is detected.

But - again - this also creates another data repository of personal data when we start to think about this end to end.

5.3 Logs and how to handle those. I mentioned the proxy w/o authentication earlier, and the key here is whether we understand if there is personal data in any logs, from the client through the network to the application / business logic / database layers and identity and authentication. If the answer is yes, then we need to think a little bit more about how to work with this. Can we remove all logs - and if yes, did we lose data and break the end-to-end audit trail? So we need to deploy a solution for how we archive logs and ensure that only authorized persons have access to them, and also have a technical solution protecting against, for example, data deletion to hide other criminal actions.
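One way to approach the question of whether logs contain personal data is to scan them for obvious identifiers before archiving. The sketch below looks only for IPv4 addresses and email addresses, which is an assumption made for illustration (real logs can contain many other identifiers), and it replaces them with a salted hash so that the same identifier still maps to the same token and events can be correlated. Pseudonymization is one possible option here, not a method described in this post; the function names, the salt and the sample log line are hypothetical.

```python
import hashlib
import re

# Deliberately simple patterns for illustration; they do not validate the values.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def contains_personal_data(line):
    """True if the log line contains something that looks like an IP or email address."""
    return bool(IPV4.search(line) or EMAIL.search(line))


def pseudonymize(value, salt):
    """Replace an identifier with a stable token derived from a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "id-" + digest[:12]


def scrub_line(line, salt):
    """Return the log line with email and IPv4 addresses replaced by tokens."""
    line = EMAIL.sub(lambda match: pseudonymize(match.group(0), salt), line)
    line = IPV4.sub(lambda match: pseudonymize(match.group(0), salt), line)
    return line


# Hypothetical proxy log line.
sample = "2016-10-15 03:45:10 10.1.2.3 jack@example.com GET /crm/customers"
print(contains_personal_data(sample))
print(scrub_line(sample, salt="keep-this-secret"))
```

Whoever holds the salt can still link a token back to a known identifier, so access to the salt needs the same restriction to authorized persons as access to the archived logs.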

5.4 Data - this is wider, because with integrations and connections it is more difficult to say how apps will connect and how they handle data across applications. One sure thing is that together with proactive authentication monitoring we must also monitor users' usage of data, regardless of whether the data is stored in a managed or unmanaged way.
So we need to log all access, both failed and successful, and what I have observed is that this is not at an acceptable level. I have also discussed why monitoring only failed access is not enough: it does not capture the case where the technical rights allow the access but the user should not use that data source based on his role in the organization. A good example is a normal IT person who has access to financial or POS systems using his basic work account.
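The difference between "failed" and "successful but should not have happened" can be shown with a small review function. This is a minimal sketch: the role policy, user names and systems are hypothetical, and a real environment would pull them from the identity system and the access logs.

```python
# Hypothetical policy: which data sources each role is expected to use.
ROLE_POLICY = {
    "sales": {"crm"},
    "finance": {"crm", "finance"},
    "it": {"helpdesk", "monitoring"},
}


def review_access(events, user_roles, policy=ROLE_POLICY):
    """Flag failed attempts and successful access that falls outside the user's role.

    Each event is a tuple (user, system, succeeded).
    """
    findings = []
    for user, system, succeeded in events:
        allowed = policy.get(user_roles.get(user), set())
        if not succeeded:
            findings.append((user, system, "failed attempt"))
        elif system not in allowed:
            # Technical rights allowed this, but the role does not justify it.
            findings.append((user, system, "succeeded but outside role"))
    return findings


user_roles = {"jack": "sales", "ian": "it"}
events = [
    ("jack", "crm", True),      # expected daily work
    ("jack", "finance", False), # blocked, visible in failed-access monitoring
    ("ian", "finance", True),   # allowed technically, invisible if only failures are monitored
]
for finding in review_access(events, user_roles):
    print(finding)
```

The third event is the one that failed-only monitoring never sees, which is why successful access has to be logged too.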

So logging is required to see whether a security breach has happened, but is that enough? No, unless you have a process and a person to look through all the logs. There are also tools and services which use machine learning to recognize abnormal behavior.

6. Network - the network is changing, but that does not reduce its criticality; instead, more of the traffic will go from clients to SaaS services through the Internet instead of the internal MPLS network. This turns the whole network mindset around, from protecting our internal network where all the services are to how and where we connect, and also how to reduce the cost. The key questions are how and from where the users connect and where their services are hosted, and the more we go to the cloud, the more we need to think about name services and how to ensure the fastest connection for the client. As said, geoDNS will have a huge impact on the overall DNS architecture, where your clients have had a DNS IP address pointing to the Active Directory DNS, which has then forwarded the DNS query to the corporate public name server in one location. This means that a user in Ottawa will use the public DNS in Helsinki, and all responses given back to the client will point to Europe. An example of this is Office 365, where in this case the client will get the IP address of the MS service in Dublin and will connect through the internal MPLS network to the IBO in Europe. There is nothing wrong with this, but if the client in Ottawa could use geoDNS, get a local MS service IP address and have a local IBO, the client would connect to the nearest Microsoft front-end Exchange, and the data from Dublin would then be managed by Microsoft, which moves all email traffic from MPLS to the Internet.

Another example is VPN usage: when services go more and more to the cloud and spread all over the world, tight security and VPN usage start to be an issue. Already encrypted traffic will be encrypted again and routed in the VPN through an on-premise gateway, which can mean that a user located in a hotel in Dublin must take a VPN tunnel to a US concentrator, which then routes the traffic back over MPLS to Europe and the IBO in Helsinki, or from the US to the Internet - not cool at all.

So moving from legacy to SaaS and digitalizing the work might, and usually does, also have an impact on the network, routing and network services, and makes the transition even more complex and unfortunately costly too. Moving from 3 regional IBOs to country-based IBOs with geoDNS is not a one-day activity; instead it requires hard design and implementation work, where the capabilities of the organization and its service providers are the key.

But time to go and all good so far.

To Be Continued...

"All opinion, thoughts and pictures are my own"
Car Show Lahti/Finland October 2016


Wednesday, October 12, 2016

GDPR III and beyond

Thinking is good but also painful in IT - or is it?

Should we start bottom up or top down when thinking of GDPR ==> Information ==> DATA ==> and finally storage, where the data has been saved in the past and will be saved in the future, hopefully with retention policies and archiving?

If we go back to the root and ask why we have storage, the answer should be clear - we want to do business, and without business there is no process to create data and no demand for storage. This should be clear to all, but when we take some perspective and look from outside, IT might still define the storage architecture used for everything. That has worked earlier, but today it is not so obvious anymore, and IT needs to discuss more with the business to understand its demands and how IT can bring new ideas and be an enabler rather than an ongoing cost.

What if we turn the idea upside down and start to think about what kind of profiles we have in the organization, like:
  • Finance
  • HR
  • Sales
  • Communication
  • Training 
  • IT
  • R&D and product development
  • Procurement
  • ...departments...
- where each organizational silo, department or business unit uses its own applications and creates data in its required format, which can be totally separate from what other units manage. In parallel, the change from on premise to public cloud and SaaS services has spread the corporate data to multiple locations - no longer one data center with full control - regardless of whether that happened because the services were beyond own IT's capabilities to offer and deliver or because of a business decision.

Nevertheless the data mass is growing and its formats extend from traditional static files to audio and video files, but who actually designs where these should be stored, how they should be available to end users, to whom, and how long they are valid? Sounds like it has something to do with governance, policies, metadata blah blah blaah, and still the question remains of where to save this data and get clear, measurable benefits from it....

I have asked myself multiple times how GDPR and these topics reflect on storage, and the answer might be not at all and at the same time everywhere, depending on the data content: does it include personal data, is it business data and is it valuable business data, is it a final version or a draft version, is it searchable, are there retention policies to delete information and data when there is no legal reason to keep it anymore - along with the question of what the value of the data is.

And like it or not, we come back to the basic questions of who owns the data and who creates the data / information. Here the answer is the business and the sales person managing opportunities in CRM and sending approved proposals to the customer based on an RFP, as an example. If we simplify even more and split the tasks into smaller and smaller parts to understand what kind of information is handled in an RFP response, we can quite easily find two types of data: structured data (customer information like address, contact persons, sales activities such as calls and emails, campaigns...) managed in the CRM solution, and unstructured data, which is actually the result or deliverable - the proposal.

Sounds simple; it can be or not, depending on the business and its size.

Let's open up the RFP response process. The sales person creates a new opportunity in the CRM, maybe with workflows to get internal approvals to even start the work, staff resources and create the bid team.

The bid team is like a small project where the bid manager is responsible for the schedule and deliverables, which are usually printed or electronic documents based on the RFP requirements. The work requires experts and SMEs able to create the solution, estimate the solution workloads and components, estimate the schedule, create the financials, and define what is in scope and out of scope and what the assumptions are. All these usually need to be approved: by the business - yes, this is what we want to sell and what the customer is asking for in the RFP; by delivery - yes, we can deliver this in the presented time windows and with the presented resources; by finance - yes, all the financials like FX, invoicing cycle and internal funding are aligned; and by legal - yes, from the legal point of view we are OK.

Simple, but thinking about this small project and the data created, it is not only the customer RFP answer file; it is a bunch of Excel files, drawings, technical documents, you name it. Now we can ask ourselves where we are going to store these files - and sorry, but even before that, ask how we would work and manage versions, how we share files, how each person knows what the current version is, what the additional material is, and what happens if we need to restore some part we already deleted. If we make this even more complex so that the bid team members do not all work in the same office, it will increase the internal cost of getting people to work in the same place, unless....

OMG, a question again. How well does our current CRM solution support collaboration and communication during the bid work?
  • Brilliant - all communication and collaboration features are available from one application / service
  • Good - some minor issues, like lack of IM, share, comment or review features, as an example
  • None - we can manage customers, upload documents and send preformatted emails from the client, but the role of our CRM is customer relationships and sales activities, not creating documents.

What was your answer?

In the same way, when you use an online web shop to buy a book, the system does not include the writer, the printing press or the forklifts moving the boxes; it includes only the customer data, the sales items and the orders ==> the end product itself, not the draft, not forklifts, not paper, not ink ==> the end product.

So based on the earlier user profile, we can identify data stored in two different locations based on the nature of the information:
  • Managed data - data in the CRM system (Microsoft Dynamics 365, Salesforce (there might be some others too :-)) like customer name, address, contact persons, opportunities and their status and value, signed proposals sent to the customer and hopefully signed contracts too, with terms and conditions
  • Unmanaged data - data saved in SharePoint Online (are there other competitive solutions available with end-to-end integration to communication and analytics...) like Word, Excel, PowerPoint, Visio, AutoCAD and other files, with version history (major-minor), metadata and data classification, workflows and so on, not forgetting the search capability: we have offered this kind of service or product earlier, so let's find those cases and use copy and paste to reduce the time for the proposal and, in parallel, to benchmark the price. And all this with offline capabilities, automatic synchronization and sharing capabilities.

Based on the above we start to talk about the digital workplace and digital work, where the user can work from anywhere, use any device and, for example, approve the final version using a phone or tablet, or edit the same document version at the same time from a different location - all features not usually available in CRM.

As said earlier, each business unit has different demands, and when thinking about upgrading the infrastructure one good thing is to analyze how each application and service uses storage and what requirements it has for the infrastructure based on user demands. Summarize those to understand the big picture, and then with innovation and digital in mind start to find the solution, even though it might have a bigger impact than moving data from old storage to new without any change.

As said, starting from the business view, moving to user profiles and understanding their daily work and the information they need or create is pragmatically quite valuable and should drive the future roadmap to the digital workplace.

Still keeping in mind the lesson from my grandpa - the poor cannot afford to buy cheap - meaning you have to buy twice: first the cheap one and then the more expensive one.

To be Continued.....

"All opinions are my own"















Tuesday, October 11, 2016

GDPR to be continued II

Thinking...

When we think about GDPR and user data, we can divide the data into managed data, usually with application, business logic and database layers, and unmanaged data, which is more the data on file servers, in SharePoint or in any document management system, where the data is handled as concrete units, like a file with lots of words, rather than as smaller pieces of information that are part of a larger whole, like the address of one individual customer in a CRM solution.

If we focus on unmanaged data, millions of questions open up, at least in my mind: governance, retention and archive policies, metadata, data format, data content, where to save which data, what kind of storage architecture we have to support the type and value of the data, what kind of security policies we have for data, do we classify data, do we protect data, do we have any kind of data loss prevention / protection mechanism in place, how do we recognize a security breach (we don't), how do we detect and control security issues or abnormal user behavior, how do we back up and restore data, how do we archive backups, how do we handle security logs if any, do we know how many duplicates we have, do we know how many movies or cat or dog pictures we have in our systems and in how many copies, do our users know what to save and where, and maybe the hardest questions: how does our search work, how much time do our employees spend searching for something they know is stored somewhere but cannot find, and how much data do we have that we know nothing about at all?
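At least the duplicate question can be answered with very little code. The sketch below walks a folder tree and groups files by the SHA-256 hash of their content, so copies are found even when file names differ. The function names are hypothetical and the approach is only one simple way to do it; it reads every file in full, which can be slow on a large file server.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file's content, read in chunks to keep memory use small."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def find_duplicates(root):
    """Map content hash to file paths for every content found more than once under root."""
    by_digest = defaultdict(list)
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            by_digest[file_digest(path)].append(path)
    return {digest: sorted(paths) for digest, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicates` on the root of a file share returns one entry per duplicated content, with all the paths where that content lives.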

The funny thing is that organizations globally spend a huge amount of time = money finding data in a data mass where part of the data has no value and is no longer required to be stored by any law. The reason has been just if if if if if - you might have heard that if aunt had b..s she would be uncle, or if cows fly, or if kids have guns and so on.
All this impacts end user experience and efficiency, which negatively impacts the productivity of users and the organization, and the shares and dividends the organization is paying.

All this should be seen as part of the organization's digital workplace journey, where the data is available, searchable and has value, and old data is retired and deleted or archived. The data is not seen as just data; it is seen as an asset and value for the organization.

If I were the CEO or chairman of the board, this would be one thing I would be interested in, at least at the moment when someone told me that if the worst case happens we have to pay penalties of 4% of total annual worldwide turnover or 20 million euros - whichever is higher.

So where to start is a great question, and there are multiple approaches - but without the board's engagement and commitment, why start at all, when you cannot achieve and realize what you are looking for? You must have commitment from the top and get the most senior people to understand the background across organization units: Business, HR, Finance, IT, R&D, Production... you name it.

But let's get back to unmanaged data, where most of the unknown information might be, of which the organization does not have any understanding.

Regardless of your business - whether you are in finance, production, resources, health care or high technology - I assume that you have a couple of file servers in branch offices and data centers, a couple of SharePoint environments where the data migration was postponed due to poor finances, and maybe a separate document management system like Documentum and product lifecycle management (PLM) applications. And because of how users work, the data is not stored where it should be, which makes it worthless: it cannot be found, there are too many versions, and nobody knows which is the latest version with the right data.

One question you should ask your business is "Do we know what GDPR means and how it impacts us?"

To be continued

"All comments are my own"

Monday, October 10, 2016

How the new EU General Data Protection Regulation will impact organizations

The new EU GDPR will come into effect in May 2018, and it will be a bigger issue than expected and understood - and it's the law.

One key question will be what personal data is, where in all our systems it has been stored and for how long. Do we know where the users' personal data is, and are we even able to find personal data in our data mass? Do we have personal data only in managed data, like applications and databases, or do we also have personal data in unmanaged data - and honestly, can someone explain what unmanaged or dark data is, whether we have it and how much?

The short answer is that yes, you have, and usually a lot. Veritas used the term Databerg, like an iceberg - you only see the 10% and the rest is in the dark, beyond your eyes and understanding. It is historical data, data where the policies and controls have failed.

You have just deployed new tools but not migrated or deleted the old ones - yes, deleted. Normally a corporation takes backups from end user workstations to the local file server, which is then replicated to the central data center and then stored on backup tapes. And this happens for the same file on multiple users' computers -- backed up to the local branch office file server -- backed up to the centralized data center and included on the backup tapes. And for sure it is also in email and PST files, backed up the same way as file and email backups. Sounds familiar?

If not - I don't believe you

And based on the above, what if a customer, or you as an employee, want to be forgotten? How do you ensure that your yearly reviews or saved log files from an authenticating proxy will be deleted and not restored from backup in a crisis, when the systems have crashed, so that your data comes back to the system and becomes visible? The IP address is personal information here, showing that you as an individual have tried to connect from your PC to the Internet, regardless of whether the target was against corporate policies - maybe.

One question will be pictures and all the legal topics, for example how you can publish pictures on the web with people other than yourself in them; the picture's metadata can also include the location, which helps to identify where the picture was taken and who is in it. And what if I want to be forgotten - is it enough to delete a picture where there are also other people??

Questions, questions, but very hard to find answers.

But one key here is to see this as an organization-wide issue and think about what kind of business units and roles we have and what kind of data they manage. Is there personal data? If yes, do we collect only the minimum, or a little bit more just to be sure? Do our data processors know how to work with our data, and have we given them guidelines? If we have not shared guidelines and trained them, how would they know how to work? Should they make their own guides for working with our data, and does that turn them into a registrar (controller) when they define how to work - or should they just raise their hands and stop working if not guided?

To Be Continued....


"All comments are my own"




Read more https://en.wikipedia.org/wiki/General_Data_Protection_Regulation

To Be Continued