Today, data has become a very important asset. In the past it was regarded only as a by-product generated in systems when clients conducted business. Now, customer business information reflects customer demands, and accumulating tens of thousands of such records is conducive to designing new products and creating new value through personalized marketing. Data has become an asset and needs to be managed. The scale and flexibility of data, together with the ability to collect and use it, will determine the core competitiveness of an enterprise. An enterprise that controls its data can gain insight into the market and respond quickly and accurately, which means a huge return on investment. Therefore, the IT department of the enterprise will be transformed from a "cost center" into a "profit center", and data will become the core asset of the enterprise.
Corporate strategy will change from "business-driven" to "data-driven". Data-based decision making is the future direction of enterprises. In the past, many enterprises limited the analysis of their own business development to simple summaries of data and information, and lacked in-depth analysis of customers, business, marketing and competition. Decisions that policy makers base solely on their subjective, experience-based evaluation of the market result in inaccurate strategic positioning and carry significant risks. In the era of big data, enterprises obtain valuable information by collecting and analyzing internal and external data. By mining such information, enterprises can predict market demand and support intelligent decision analysis, so as to develop more effective strategies.
The most important aspect of big data is that it directly affects how companies make decisions and who makes them. Across the business world today, people still rely more on personal experience and intuition than on data to make decisions. In a world with limited information, high acquisition costs and no digitization, it is understandable that people in senior positions make the decisions, but the big data era lets the data speak.
Yonghong provides both strategic and tactical solutions. Strategically, it helps financial enterprises establish a "data-driven" development mode, improve the data operation system and implement a big data operation center. Tactically, through operational optimization, management improvement, risk control and other applications, it comprehensively promotes the core value and competitiveness of financial enterprises.
Facing the challenge of big data, financial enterprises should establish a "data-driven" development mode, improve their data operation system and implement a big data operation center. Tactically, they should comprehensively improve their core value and competitiveness through operation optimization, management improvement, risk control and other applications.
Figure 2: Architecture diagram for building the bank's big data operation center.
The top priority in building the bank's big data operation center is to focus on three construction goals: operation optimization, management improvement and risk control. They are embodied as follows:
1. Operation optimization centered on user data comprehensively improves operational efficiency through customer portraits, precision marketing, product optimization, public sentiment analysis, and market and channel analysis.
2. Management improvement guided by input-output and value contribution achieves refined management through performance assessment, the leadership cockpit, the management accounting platform and other applications.
3. Risk control uses multidimensional safety judgment and finer-grained modeling and forecasting to implement applications such as SME loan evaluation, real-time fraudulent transaction analysis and anti-money laundering analysis, and strengthens the identification, evaluation and forecasting of commercial bank risk to effectively guard against financial risks.
Figure 3: Architecture diagram of the Yonghong Tech MPP data mart
From the data source to the final presentation, the architecture comprises the following layers:
ETL layer: It uses a PC server as the ETL front-end processor to clean, transform and load data.
Offline analysis and computing platform: It uses Hadoop distributed storage to support structured and unstructured data storage and to facilitate scale-out as data size increases. It can process data in the storage layer and run data model calculations according to analysis needs, handling large-scale batch mining and analysis tasks that do not require low latency.
Real-time online analysis platform: It uses the high-performance Yonghong MPP data mart as the medium. The MPP distributed data mart supports high concurrency and high availability. Each data mart prepares detail data for light modeling around a topic; the data are distributed across the nodes and backed up at the same time. The data are compressed efficiently in column storage, labeled and stored on disk. When a query requires computation, in-memory calculation is used, every node computes at the same time, and the result is presented at the application layer.
Application layer: It provides self-service analysis tools based on Yonghong agile BI to visually display the data from the offline and online analysis platforms. Both end users and IT developers can access the BI system through a mainstream browser, and users can also access it from a mobile terminal. The BI system provides system monitoring, multi-level authority management, multi-dimensional data analysis and other functions, and also supports self-service report design and data analysis.
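The way the real-time layer answers a query (column-stored partitions spread over nodes, every node computing at the same time, partial results merged for the application layer) can be sketched as follows. This is only an illustrative sketch, not the Yonghong implementation: the partition contents, the function names and the use of threads to stand in for separate nodes are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: each entry stands for one node's partition,
# stored column by column rather than row by row.
PARTITIONS = [
    {"region": ["north", "south", "north"], "amount": [100, 40, 60]},
    {"region": ["south", "east"], "amount": [10, 25]},
    {"region": ["north", "east", "east"], "amount": [5, 15, 30]},
]


def node_aggregate(partition):
    """Sum `amount` per region using only the two columns the query needs."""
    partial = {}
    for region, amount in zip(partition["region"], partition["amount"]):
        partial[region] = partial.get(region, 0) + amount
    return partial


def merge(partials):
    """Combine the per-node partial sums into the final result."""
    total = {}
    for partial in partials:
        for key, value in partial.items():
            total[key] = total.get(key, 0) + value
    return total


def query_sum_by_region(partitions):
    """Run the per-node aggregation concurrently, then merge the results."""
    # Threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(node_aggregate, partitions))
    return merge(partials)


result = query_sum_by_region(PARTITIONS)
print(result)
```

Because each node returns only a small dictionary of partial sums, the merge step stays cheap even when the detail data on each node is large.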
Comprehensively helps financial institutions explore the value of data effectively
Solutions with high cost performance, low TCO, agility, self-service, exploratory multidimensional analysis, high availability and high concurrency.
Effectively enhances cross-selling, investment management, market share and capability, and cultivates their own core information competitiveness.
The entire system architecture abandons the scale-up approach common in traditional systems: both the data mart and the BI front end support scale-out. As corporate business grows, the demand for data analysis grows significantly, so a platform architecture based on an x86 PC server cluster is critical. Under this architecture, there is no need to purchase expensive minicomputers to support high concurrency, mass data calculation, data analysis and business development; several ordinary PC servers can be bought to set up a cluster and build a highly cost-effective analysis platform.
Data layer agility: the data layer does not need pre-summary calculation. The traditional architecture requires aggregating data in advance along every dimension and index that might be needed, or precomputing with a cube. The agile BI approach simply relates the data: the imported data remain detail data, and all calculations happen in real time when the user clicks. Therefore, the data layer only needs a lightweight model to import any newly required detail data.
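The contrast between precomputing every summary and aggregating detail data on demand can be illustrated with a small sketch. The sample rows, column names and function names below are hypothetical and chosen only for illustration; the count of 2^n group-by combinations for n dimensions is the standard size of a full cube, not a figure taken from the product.

```python
from collections import defaultdict

# Hypothetical detail rows, kept unaggregated as the agile approach describes.
DETAIL = [
    {"branch": "A", "product": "loan", "month": "Jan", "amount": 100},
    {"branch": "A", "product": "card", "month": "Jan", "amount": 30},
    {"branch": "B", "product": "loan", "month": "Feb", "amount": 70},
    {"branch": "B", "product": "card", "month": "Feb", "amount": 20},
]


def cuboid_count(n_dimensions):
    """Number of group-by combinations a full cube would have to precompute."""
    return 2 ** n_dimensions


def aggregate(rows, dimensions, measure):
    """Sum `measure` over whatever dimensions the user picks, at request time."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


print(cuboid_count(3))                          # groupings for 3 dimensions
print(aggregate(DETAIL, ["branch"], "amount"))  # computed only when requested
print(aggregate(DETAIL, ["product", "month"], "amount"))
```

Adding a fourth dimension to the detail rows needs no change to `aggregate`, whereas a precomputed cube would double in the number of groupings to maintain.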
Application layer agility: with a flexible ROLAP mechanism, each click generates SQL in real time for the request and sends it to the computing layer for calculation, which makes it easier to adapt to business changes. There are few module levels, so reports and dashboards can be designed directly, or exploratory analysis carried out, as soon as modeling is complete, which also makes the system easier for end users.
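As a rough illustration of the ROLAP idea, the sketch below turns a user's selection of dimensions, measure and filters into a SQL statement at request time and runs it against detail rows. It is a minimal sketch under stated assumptions: the table name `txn`, the column whitelist and the function `build_query` are hypothetical, and SQLite stands in for the computing layer.

```python
import sqlite3

# Hypothetical table and columns; a real model would expose its own metadata.
TABLE = "txn"
ALLOWED_COLUMNS = {"branch", "product", "amount"}


def build_query(dimensions, measure, filters=None):
    """Translate one user interaction into a parameterized SQL statement."""
    filters = filters or {}
    if not dimensions:
        raise ValueError("at least one dimension is required")
    # Only whitelisted identifiers are placed in the SQL text.
    for name in [*dimensions, measure, *filters]:
        if name not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {name}")
    group_cols = ", ".join(dimensions)
    sql = f"SELECT {group_cols}, SUM({measure}) FROM {TABLE}"
    params = []
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
        params = list(filters.values())
    sql += f" GROUP BY {group_cols} ORDER BY {group_cols}"
    return sql, params


# SQLite stands in for the computing layer in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (branch TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO txn VALUES (?, ?, ?)",
    [("A", "loan", 100.0), ("A", "card", 30.0), ("B", "loan", 70.0)],
)

# A "click": total amount per branch, filtered to loans.
sql, params = build_query(["branch"], "amount", {"product": "loan"})
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Filter values are passed as bound parameters rather than concatenated into the statement, which keeps the generated SQL safe when the values come from user input.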
The topic-based mart semantically translates the physical table structure into a logical structure that is easy to understand, so that end users can easily define their own reports or dashboards by drag and drop.
Interaction and analysis capabilities of the front-end system include filtering, drilling, scaling, correlation, transformation, dynamic computing, links, and so on. Users identify a problem, find the answer and make a business decision, which amounts to exploratory analysis.
Both the offline and online analysis platforms have distributed architectures. Data storage and data computation are distributed, and there are backup and monitoring mechanisms. When one machine goes down, another machine automatically takes over all of its calculations. The analytical computing platform is widely used and remains stable and reliable even when the data size of some telecommunication customers reaches hundreds of terabytes. The distributed data mart supports hot-plug extension of computing and storage nodes. It can be extended from one node to dozens or even hundreds of nodes.
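The takeover behavior described above depends on each piece of data being held by more than one machine. The sketch below shows one simple way such failover can work, assuming every partition has an ordered list of replica nodes; the partition and node names and the `assign` function are hypothetical, and this is not a description of the product's actual scheduling logic.

```python
# Hypothetical replica placement: each partition is stored on two nodes.
REPLICAS = {
    "p0": ["node1", "node2"],
    "p1": ["node2", "node3"],
    "p2": ["node3", "node1"],
}


def assign(replicas, alive):
    """Pick, for every partition, the first replica node that is still alive."""
    plan = {}
    for partition, nodes in replicas.items():
        owner = next((node for node in nodes if node in alive), None)
        if owner is None:
            raise RuntimeError(f"no live replica for {partition}")
        plan[partition] = owner
    return plan


# All nodes healthy: every partition is computed on its primary node.
print(assign(REPLICAS, {"node1", "node2", "node3"}))

# node1 goes down: its partition p0 is computed on the backup node2 instead.
print(assign(REPLICAS, {"node2", "node3"}))
```

With two replicas per partition, the plan survives the loss of any single node; losing both replicas of a partition raises an error instead of silently returning incomplete results.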
The online analysis platform supports high concurrency. As a computing layer, the data mart supports distributed computation and uses a MapReduce architecture to improve computational efficiency. The BI front end can connect directly to Oracle or Hadoop, but using Oracle or Hadoop to support a highly concurrent OLAP system is not recommended. Because Oracle stores data by row, it can support high concurrency in OLTP systems but not in OLAP systems; Hadoop, as a cost-effective storage system, is not suitable for real-time analysis. The distributed data mart of Yonghong Tech stores data by column and uses in-memory computing technology, and its computing nodes work in parallel over multiple storage nodes, which makes it very suitable for real-time analysis of mass data.
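Why column storage suits analytical queries better than row storage can be shown with a toy cost model that counts how many stored values must be touched to sum a single field. The data, names and the "values touched" measure are assumptions for illustration only; real engines also differ in compression, I/O and caching, none of which this sketch models.

```python
# Hypothetical detail rows: (id, region, product, amount).
ROWS = [
    (1, "north", "loan", 100),
    (2, "south", "card", 40),
    (3, "north", "card", 60),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


COLUMNS = to_columns(ROWS, ["id", "region", "product", "amount"])


def sum_row_store(rows, index):
    """Sum one field; the whole row is read to reach it (toy cost model)."""
    total = 0
    touched = 0
    for row in rows:
        touched += len(row)
        total += row[index]
    return total, touched


def sum_column_store(columns, name):
    """Sum one field; only that column's values are read."""
    values = columns[name]
    return sum(values), len(values)


print(sum_row_store(ROWS, 3))              # (total, values touched)
print(sum_column_store(COLUMNS, "amount"))
```

Both layouts return the same total, but the column layout touches one value per row instead of every field, and the gap widens as tables gain more columns.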
-- Zhang Xinsheng, IT science and technology department of CITIC Bank