Telecom Italia Brasil (TIM) is the Brazilian subsidiary of Telecom Italia. It holds 27% of the market and provides a range of fixed and wireless services. The company has 74 million customers, of which 84% are pre-paid wireless customers. TIM invests over $3 billion yearly in capex and is Brazil’s second-largest wireless operator. Given the size of that yearly investment, and given that the pre-paid segment accounts for the majority of the carrier’s revenue, it is important to get the most out of the network without jeopardizing the customer experience.
Adhering to Regulations While Mining Customer Data
Brazilian government regulations require carriers to archive 90 days of wireless call detail records (CDRs). TIM’s wireless network generates four terabytes of data daily from voice, data and SMS CDRs. This data is created by 11 different servers and switches with 8-10 different record layouts; as a result, there was no single repository to store all these CDRs. In addition, the upstream mediation system, which is responsible for merging CDRs into a single record for each session, was sending duplicate CDRs or not stitching the call records together completely.
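To give a sense of scale, the retention figures above can be turned into a rough storage estimate. This is a minimal sketch: the daily volume and retention period come from the text, while the replication factor of 3 is an assumption added for illustration (it is a common default in distributed file systems) and is not a figure reported for TIM.

```python
# Rough storage estimate for the 90-day CDR archive.
DAILY_TB = 4            # from the text: four terabytes of CDR data per day
RETENTION_DAYS = 90     # from the text: 90 days must be archived
REPLICATION = 3         # assumption: number of stored copies, not stated in the text


def retention_tb(daily_tb, days, replication=1):
    """Return the storage needed to hold `days` of data at `daily_tb` per day."""
    return daily_tb * days * replication


raw_tb = retention_tb(DAILY_TB, RETENTION_DAYS)
replicated_tb = retention_tb(DAILY_TB, RETENTION_DAYS, REPLICATION)

print(f"Raw archive: {raw_tb} TB")                     # Raw archive: 360 TB
print(f"With {REPLICATION} copies: {replicated_tb} TB")  # With 3 copies: 1080 TB
```

Even before any replication or summary tables, the compliance archive alone is on the order of hundreds of terabytes.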
In Brazil, nearly 3 million mobile subscribers are added yearly. Identifying network bottlenecks proactively is key to keeping customers from churning. TIM Brasil wanted to use the data it already had to collect for government compliance, together with data from network elements, to optimize its network. This was clearly a problem best answered by Big Data technology, but TIM didn’t have the technical expertise or tools to create a solution.
Hiring Proven Data Lake Experts
TIM Brasil selected Zaloni to architect and build a Hadoop Data Lake to serve as the single repository for traffic-related, inventory and provisioning data (CDRs, SNMP, server logs). Zaloni was chosen because the company is a recognized industry leader in implementing and using big data lakes. A data lake is a repository that holds a vast amount of raw data in its native format, which allows different users to analyze and manipulate that data for multiple applications.
This repository consists of a Hadoop cluster (Pivotal HD) and HAWQ, enabling scalability, high-volume processing, and archiving and retrieval of CDR data. The Hadoop cluster is fronted by a load balancer that distributes data to two HA (high availability) servers capable of ingesting 1000 files per minute. MapReduce jobs remove duplicate data and reconstruct complete call records when the records received are partial or parts are missing. SQL scripts using HAWQ create summary tables for government reporting and network analysis.
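The deduplication and stitching step described above can be illustrated with a small sketch. It is plain Python that imitates the group-by-key pattern of a MapReduce job; it is not the actual job code used in the project. The `PartialCdr` record and its fields (`session_id`, `seq`, `start`, `duration`) are hypothetical and chosen only for illustration, since the real CDR layouts are not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialCdr:
    """Hypothetical partial call record; field names are assumptions."""
    session_id: str  # key that links the partial records of one session
    seq: int         # position of this partial record within the session
    start: int       # segment start time, epoch seconds
    duration: int    # segment duration, seconds


def stitch(records):
    """Drop exact duplicates and merge partial records into one per session."""
    # "Map" and "shuffle": group records by session key.
    # Using a set per key discards exact duplicate records.
    groups = defaultdict(set)
    for rec in records:
        groups[rec.session_id].add(rec)

    # "Reduce": merge each session's partial records into a single record.
    result = {}
    for session_id, parts in groups.items():
        ordered = sorted(parts, key=lambda r: r.seq)
        seqs = [r.seq for r in ordered]
        result[session_id] = {
            "start": min(r.start for r in ordered),
            "duration": sum(r.duration for r in ordered),
            "parts": len(ordered),
            # A session counts as complete only if seq runs 0, 1, 2, ... with no gaps.
            "complete": seqs == list(range(len(ordered))),
        }
    return result


sample = [
    PartialCdr("a", 0, 100, 30),
    PartialCdr("a", 0, 100, 30),  # duplicate sent by the upstream system
    PartialCdr("a", 1, 130, 20),
    PartialCdr("b", 1, 500, 10),  # first part of session "b" is missing
]
stitched = stitch(sample)
```

In this sample, session "a" collapses to one 50-second record built from two parts, while session "b" is flagged as incomplete so that it can be handled separately rather than silently reported as a full call.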
Staying Compliant and Exceeding Customer Expectations
TIM Brasil’s Hadoop Data Lake is now in production and, thanks to the scalability of Hadoop and HAWQ, is capable of ingesting 8 terabytes daily. This architecture enables TIM to meet its immediate government reporting requirements, thereby avoiding costly fines and penalties. More importantly, the reports built on a single unified repository provide deep insight into network utilization. Network congestion can be identified in near real-time, and corrective actions can be implemented dynamically through network function virtualization (NFV). TIM Brasil has not only gained a compliance solution but can also manage the network efficiently as the volume of subscriber usage grows, ensuring a great customer experience.