Meet the expert
As a data management company, it’s important that we address current concerns surrounding data security. We asked Richard Harris Jr., our wonderful mentor and colleague, some questions about data governance, cybersecurity, and the future of analytics.
RHJ currently runs the data department for a mid-size creative and performance agency, Union, in North Carolina. From implementation and engineering to insights and reporting, he ensures the overall quality and success of the Data and Analytics Department.
His early career was spent in advertising agencies working exclusively with marketing and website data. After mastering this field and even creating his own agency in Oregon, he moved to Texas to begin working as a senior data analyst on the implementation side of data (checking accuracy, performance, implementation, storing).
When asked about transitioning to a new field, he explained: “When you feel like you’re the smartest or have nothing to gain, that’s when it’s time to move on.”
What keeps you interested in data and analytics?
“Data is hard. It’s one of those fields where every day brings a new challenge, a new solution, and a new workaround to put in place. There are a lot of theories in this field. You can sit and analyze one metric a thousand different ways. I enjoy testing trends, coming up with hypotheses and understanding if they were successful. If there’s one thing I don’t love, it’s how thankless it is.”
Is data governance becoming more of a challenge with the way the data and analytics space is transforming?
“Data has to be both available and implemented correctly, and it’s getting more complicated to do so with new laws and regulations for how data can be stored. Since the General Data Protection Regulation (GDPR) passed, a hot topic has been how you store, manage and get access to particular data. It’s only getting harder as specific platforms are making specific regulations for how their data can be pulled.”
“We constantly have to be aware of the new things coming up, especially now as it pertains to what data we can collect. Because this data is becoming so marginal, it’s tough to look at the current data and have confidence in what’s being looked at. A lot of people passing these laws and regulations don’t understand how data had been accessed before, which I thought to be secure. They’re just passing laws and regulations that limit data.”
Some of the best practices of data governance are things like strong internal communication, clearly defined roles and treating it like a marathon instead of a sprint. Are there others that you’d recommend?
“Data is a weird field where yeah, it’s a marathon, but you have to sprint your marathons.”
“Because things are changing, you have to be ready to pivot at short notice. Data collection is a long process. It’s about being agile and able to take on ad-hoc requests. You have to be in the know as it pertains to data governance and data collection. That means keeping up with popular content Google puts out, down to participating in forums and groups on Facebook and LinkedIn to understand how other people are coming across limitations.”
Resources for keeping up with data and analytics trends:
“The best place to keep ahead and see data in practice is in these forums and groups. At the end of the day it’s a constant sprint. You have to have endurance to work in this field.”
Have you noticed a shift in who’s paying attention to data and analytics?
“A lot of people who have been in it (D&A) see these changes being made, and don’t get why they’re happening. Generally for data experts in the world, these changes feel unnecessary and useless, but that’s not coming from a user perspective.”
“Users now feel more secure. There’s a juxtaposition between customer experience, customer satisfaction and people who work in data and their confidence with it being accurate. The main thing we know about these changes is that they make our jobs more complicated.”
Duplicate, incomplete and non-trustworthy data are all major concerns in data governance for healthcare. What can be done to prevent these issues?
“The biggest piece of data collection that gets overlooked is implementation. People think you can set up a data implementation platform, and think things will be easy and clean. But the most time should be spent on this piece to set up what you want to collect.”
“A strong administrator and implementation strategy helps you avoid incomplete and non-trustworthy data concerns. When it comes to duplicative data, that’s the critical piece. Duplicative, incomplete, and non-trustworthy data can be stalled by strong implementation. It takes having a developer, an implementation specialist (a data analyst who understands implementation), and a product manager who can bridge the gap between the data and technology pieces.”
“The question is: how can we convince companies that have never worked with data to invest in three individuals to make this a clean practice?”
“There are two ways to look at data over time. Historically people have been treating it like CRAP (Collect, Report, Avoid, Postpone) when it should be seen from a CARE (Collect, Analyze, Recommend, Experiment) perspective. We’re trying to get to a proactive point where we turn CRAP data into CARE data.”
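To make the duplicate and incomplete data concerns raised above more concrete, here is a minimal sketch of the kind of check an implementation specialist might run on collected records. It is an illustration only, not a method described in the interview: the field names (`user_id`, `event`, `timestamp`) and the sample records are hypothetical, and a real implementation would define its own required fields and its own rule for what counts as a duplicate.

```python
from collections import Counter

# Hypothetical required fields, chosen only for illustration.
REQUIRED_FIELDS = ("user_id", "event", "timestamp")


def audit_records(records):
    """Return the incomplete records and the duplicated keys in `records`."""
    # A record is incomplete if any required field is missing or empty.
    incomplete = [
        r for r in records
        if any(r.get(f) in (None, "") for f in REQUIRED_FIELDS)
    ]
    # Assumption: two records are duplicates if all required fields match.
    keys = [tuple(r.get(f) for f in REQUIRED_FIELDS) for r in records]
    duplicates = [key for key, n in Counter(keys).items() if n > 1]
    return {"incomplete": incomplete, "duplicates": duplicates}


# Invented sample data: one repeated record and one with an empty field.
records = [
    {"user_id": 1, "event": "signup", "timestamp": "2022-06-01T10:00:00"},
    {"user_id": 1, "event": "signup", "timestamp": "2022-06-01T10:00:00"},
    {"user_id": 2, "event": "", "timestamp": "2022-06-01T11:00:00"},
]
report = audit_records(records)
print(len(report["incomplete"]), "incomplete,", len(report["duplicates"]), "duplicated")
```

A check like this only flags problems after collection. The point made in the interview is that most of the effort belongs earlier, in deciding what to collect and how.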
Cyberattacks and data breaches are reported as the most common data challenges. Have you faced something like this in your work?
“I have not faced data breaches or cyberattacks in any companies that I’ve worked with. Not because we are or aren’t doing something, I just think it’s all about luck. Every company that stores, maintains and collects data is subject to a cyberattack.”
What can be done to avoid these types of challenges?
“Try to avoid these challenges by using platforms with strong security. For example, Amazon Web Services and Google Cloud Platform have strong security teams; use platforms that you know have a team for you as you are signing up.”
“Even then, very likely there are cases where your data is exposed. Prepare for how you respond to them (e.g., password systems) vs. where you stop them. At the end of the day, all of our data will be ‘breached’ in one way or another. There are 12-year-old kids hacking companies for fun, just to see if they can. A website I helped support in a previous role got taken down not by a data breach, just by someone interested in taking down the site. Breached data does not mean an absolute threat to the company.”
“Transparency is key; it builds user confidence. If something happens, be transparent about what happened and what was accessed, and what they can do to prevent anything further from happening.”
Tips for anticipating a data breach:
- Keeping an eye on credit reports (if SSNs were leaked)
- Empowering users to know what someone is doing with their data
- Educating them to take care of this information when something gets accessed
For instance, in 2017 Equifax announced a data breach that exposed the personal information of 148 million people. The company responded by giving those affected settlement options, such as free credit monitoring services.
What cybersecurity challenges should we be ready to face, and what are the hard questions we’re going to have to answer?
“It’s hard to anticipate what breaches you’re going to face. It depends what data you’re collecting about specific individuals. In health-tech, priorities are adhering to the Health Insurance Portability and Accountability Act (HIPAA), and having a Business Associate Agreement (BAA) in place for how we’re managing protected health information. Employees need to go through HIPAA compliance and regulation.”
“Leaking user health data, contact information, and access to our platforms are all potential risks. Stay on top of who has access, when and for how long.”
Questions for data security in health-tech:
- Are you HIPAA compliant?
- How many people have gone through HIPAA certification in your company?
- Why do you need our data? What are you going to provide for us?
- What kind of exclusions are you applying to the data you use, to ensure we’re only collecting what’s absolutely necessary to do the work to support us?
- Do you have a security/data security person on staff who can help respond to leaks quickly?
- What platforms are you using to store this data?
- What kind of security processes or VPNs do you use?
- How long do you need access to this information to do the work provided?
- How long have you been doing this?
- Have you been asked one of these and not known how to respond?
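The advice to stay on top of who has access, when, and for how long can be turned into a simple periodic review. The sketch below is one possible shape for that review, assuming access grants are tracked with an expiry date. The user names, dataset name and dates are invented for illustration and do not describe any real system.

```python
from datetime import date

# Hypothetical access grants; every value here is invented for illustration.
grants = [
    {"user": "analyst_a", "dataset": "patient_contacts", "expires": date(2022, 6, 30)},
    {"user": "vendor_b", "dataset": "patient_contacts", "expires": date(2022, 12, 31)},
]


def expired_grants(grants, today):
    """Return the grants whose expiry date is before `today`."""
    return [g for g in grants if g["expires"] < today]


# Review access as of a given date and list what should be revoked.
for g in expired_grants(grants, date(2022, 8, 1)):
    print(f"Revoke {g['user']} on {g['dataset']} (expired {g['expires']})")
```

Recording an expiry date when access is granted is what lets a reviewer answer the "for how long" question later without relying on memory.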
Do you have any thoughts about the advancement of machine learning and artificial intelligence becoming popular in recent years?
“Definitely. It is the coolest thing that has come into the data world. Predictive analytics, data modeling; it’s what people are most interested in.”
What can you tell us about predictive analytics?
“Finding a model that fits is a long process when you’re predicting data. To build an accurate model about a user or company, you need at least 2-3 years of clean data to build based on recent trends. If you didn’t have a strong implementation practice 3 years ago, you’re not going to be able to build accurate models for 2-3 years from now. Data scientists spend 80% of their time cleaning/cleansing data sets until one accurate model can be built (one of those things that is a marathon).”
“Predictive analytics is an industry by industry trend. We’re using machine learning to analyze what has already been done before, and talking about the end result to get people excited. The toughest piece is getting the buy-in: how accurate is this for us? People always want projections to be exceeded vs. met. Sell into the end results.”
“It depends on how confident people feel within their own jobs to sign up for something like Theralytics. As always, transparency on how it works is key. For myself, even if it automates me out of a career, I’m cool with that and confident in the ability to go somewhere else. It’s more exciting than it is scary.”
Richard Harris Jr., https://www.linkedin.com/in/richardallenharrisjr/
Union Creative & Performance Agency, https://union.co/
General Data Protection Regulation, https://gdpr-info.eu/
“Apple expands industry-leading commitment to protect users from highly targeted mercenary spyware” Apple, 6 July 2022, https://www.apple.com/newsroom/2022/07/apple-expands-commitment-to-protect-users-from-mercenary-spyware/
“Prepare for the future with Google Analytics 4” Google, 16 Mar 2022, https://blog.google/products/marketingplatform/analytics/prepare-for-future-with-google-analytics-4/
Towards Data Science, https://towardsdatascience.com/
“DataIsBeautiful” Reddit, https://www.reddit.com/r/dataisbeautiful/
“Big Data, Data Science, AI, IoT, Cyber Security & Blockchain” LinkedIn, https://www.linkedin.com/groups/3990648/
“Data Science World” Facebook, https://www.facebook.com/groups/BigDataPakistan
“Are you looking at data from a CRAP or CARE perspective?” Richard Harris Jr., 24 June 2022, https://www.linkedin.com/posts/richardallenharrisjr_data-analytics-activity-6944969877862449152-Qdrt/?utm_source=linkedin_share&utm_medium=member_desktop_web
“Equifax says cyberattacks may have affected 143 million in the U.S.” Tara Siegel Bernard, Tiffany Hsu, Nicole Perlroth and Ron Lieber, 7 September 2017, https://www.nytimes.com/2017/09/07/business/equifax-cyberattack.html
Equifax Data Breach Class Action Settlement, https://www.equifaxbreachsettlement.com/admin/services/connectedapps.cms.extensions/18.104.22.168/a4f6125d-1f25-4e2c-aa90-3f2bb20e811f_1033_EFX_-_Long_Form_Notice.pdf
Health Insurance Portability and Accountability Act of 1996 (HIPAA), https://www.cdc.gov/phlp/publications/topic/hipaa.html#:~:text=The%20Health%20Insurance%20Portability%20and,the%20patient’s%20consent%20or%20knowledge.
“What is a HIPAA Business Associate Agreement (BAA)?” 28 April 2017, https://healthitsecurity.com/features/what-is-a-hipaa-business-associate-agreement-baa
“A data cleaning journey” Eliud Nduati, 29 July 2021, https://medium.com/analytics-vidhya/a-data-cleaning-journey-2b0146407e44#:~:text=It%20is%20estimated%20that%20data,Data%20cleaning%20enhances%20data%20quality.