Retail Analytics – Entities and modules

Here I list some domain model entities, processes, measures, and dimensions that are heavily used in the retail sector and related domains. Purchasing: Purchase orders: quantity vs. cost; ordered vs. received merchandise; time of ordering, expected time of delivery, actual time of delivery. All of these are tracked by supplier. Marketing: prospects, customers, VIPs, staff, segments, marketing campaigns,[…]
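
As a rough illustration of the purchasing measures listed above (ordered vs. received quantity, cost, and expected vs. actual delivery time, grouped by supplier), here is a minimal Python sketch. The `PurchaseOrderLine` fields and the `supplier_measures` function are hypothetical names chosen for this example, not part of any particular retail system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PurchaseOrderLine:
    # Hypothetical fact-table row; field names are assumptions for illustration.
    supplier: str
    qty_ordered: int
    qty_received: int
    unit_cost: float
    expected_on: date
    delivered_on: date


def supplier_measures(lines):
    """Aggregate fill rate, ordered cost, and average delivery delay per supplier."""
    totals = {}
    for ln in lines:
        t = totals.setdefault(
            ln.supplier,
            {"ordered": 0, "received": 0, "cost": 0.0, "delay_days": 0, "lines": 0},
        )
        t["ordered"] += ln.qty_ordered
        t["received"] += ln.qty_received
        t["cost"] += ln.qty_ordered * ln.unit_cost
        # Positive delay means the delivery arrived later than expected.
        t["delay_days"] += (ln.delivered_on - ln.expected_on).days
        t["lines"] += 1

    return {
        supplier: {
            "fill_rate": t["received"] / t["ordered"] if t["ordered"] else 0.0,
            "ordered_cost": t["cost"],
            "avg_delay_days": t["delay_days"] / t["lines"],
        }
        for supplier, t in totals.items()
    }


sample_lines = [
    PurchaseOrderLine("Acme", 100, 90, 2.0, date(2024, 1, 10), date(2024, 1, 12)),
    PurchaseOrderLine("Acme", 100, 100, 1.0, date(2024, 2, 5), date(2024, 2, 5)),
]
measures = supplier_measures(sample_lines)
```

In a real warehouse these would be measures on a purchase-order fact table with supplier and date dimensions; the dictionary here only stands in for that aggregation.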

Graphs with Big data

Vision. Spark ML. MLlib. DataFrame. SVD. import org.apache.spark. CRIM: Centre recherche.. Spark. groupBy, reduceByKey. Spark is not a storage system like HDFS, HBase, or Cassandra. Git. Micro-batching. Why graphs: Semantic Web. Communities. GraphLab. MS. Graph construction. Post-processing. Triangles. GraphX: vertex and edge tables. val relationships rdd.vertex(string string) sc.parallelize(Array(= RDD.edge() val graph = Graph() Save: vertices.saveAsObjectFile. Use cases: PageRank. Triangles. Shortest[…]
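
To make the vertex-table and edge-table idea and two of the use cases named above (triangles and PageRank) concrete, here is a small plain-Python sketch. It is not GraphX and does not use Spark; the sample vertices and edges and the function names are assumptions made up for this illustration, and the PageRank loop is a basic power iteration with a fixed number of steps.

```python
from itertools import combinations

# Vertex table (id -> attribute) and edge table (src, dst); sample data only.
vertices = {1: "alice", 2: "bob", 3: "carol", 4: "dave"}
edges = [(1, 2), (2, 3), (3, 1), (3, 4)]


def undirected_neighbours(edge_list):
    """Build an undirected adjacency map from the edge table."""
    adj = {}
    for src, dst in edge_list:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set()).add(src)
    return adj


def count_triangles(edge_list):
    """Count triangles, treating edges as undirected (assumes no self-loops)."""
    adj = undirected_neighbours(edge_list)
    closed_pairs = 0
    for nbrs in adj.values():
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                closed_pairs += 1
    # Each triangle is seen once from each of its three corners.
    return closed_pairs // 3


def pagerank(vertex_ids, edge_list, damping=0.85, iterations=20):
    """Basic power-iteration PageRank over directed edges."""
    out_links = {v: [] for v in vertex_ids}
    for src, dst in edge_list:
        out_links[src].append(dst)
    n = len(out_links)
    rank = {v: 1.0 / n for v in out_links}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in out_links}
        for v, targets in out_links.items():
            # Vertices without out-links spread their rank over all vertices.
            receivers = targets if targets else list(out_links)
            share = damping * rank[v] / len(receivers)
            for t in receivers:
                nxt[t] += share
        rank = nxt
    return rank


triangle_count = count_triangles(edges)
ranks = pagerank(vertices, edges)
```

On a cluster the same two tables would be distributed collections and the algorithms would come from the graph library; the sketch only shows the shape of the data and the computations.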

BI Roles & Responsibilities Matrix

Here are the roles and responsibilities of BI specialists as a matrix: Role, Responsibilities, Qualifications. Senior Data Analytics Strategist – Responsible for developing data analytics and measurement strategies to facilitate fact-based planning and decision making at the City – enable service improvement projects. – A technical leader and subject matter expert for data and analytics.[…]

Domain Model knowledge of industries

Pharma: Good understanding of the pharma landscape, especially the relationships between GPOs, IDNs, group practices, hospitals, clinics, payers, and physicians, and their attributes. Experience with pharma data sources such as AMA, IMS, Symphony, HCOS, CMS, clinical trials, etc. Experience in a pharma MDM data warehouse environment is desirable. Finance: portfolio management; portfolio accounting systems, financial[…]

SAP Business Objects

Business Objects: A set of SAP tools that help build a data analytics solution: – Data integration: BO Data Services. – Formatted reports: Crystal Reports. – Dashboarding: Xcelsius. – Data exploration: Explorer. – Ad hoc reporting: Web Intelligence. – In-depth analysis: BEx Analyzer, an Excel component. First experience with SAP BO: Creation of a universe. A[…]

Big Data and Security – Cloudera

Security issues to consider when securing data: Who can access what, when, and from where. Authentication. Authorization. Encryption. Key management. Identity management system. Cloudera has a distribution of Hadoop that contains advanced security features, serving two objectives: 1. To protect the data contained in the Hadoop cluster. 2. To analyze streaming data, detecting where security might have[…]

Data Hub

A data hub is a centralized location for data, a special case of a data lake, where data are well structured, homogeneous, and highly reusable: it serves data in multiple formats from multiple sources to multiple potential destinations. Multiple data hub architectures exist: The Publish-Subscribe Data Hub The Integration Hub The Operational Data Store (ODS)[…]
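
To illustrate the first architecture named above, here is a minimal in-memory Python sketch of a publish-subscribe hub: sources publish records to a topic and every destination subscribed to that topic receives them. The `DataHub` class, the topic name, and the sample record are hypothetical; a real hub would add persistence, schemas, and delivery guarantees.

```python
class DataHub:
    """Toy publish-subscribe hub: topics map to lists of subscriber callbacks."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        """Register a destination that wants every record published on a topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, record):
        """Deliver a record to all subscribers of the topic; return how many got it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(record)
        return len(callbacks)


hub = DataHub()
warehouse_rows = []
dashboard_rows = []

# Two destinations consume the same topic independently of each other.
hub.subscribe("sales", warehouse_rows.append)
hub.subscribe("sales", dashboard_rows.append)

delivered = hub.publish("sales", {"store": "S1", "amount": 42.0})
undelivered = hub.publish("returns", {"store": "S1", "amount": 5.0})
```

The point of the design is that sources and destinations know only the topic, not each other, which is what lets one source feed several destinations.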

Software Development Life Cycle of Machine Learning Projects

The software development life cycle of machine learning projects is split into two phases: R&D: Preprocessing: Business objective and rules R&D: BPM. UML. Use cases. etc. Data R&D. Data profiling. ETL: Extraction. Cleanup. Integration. Transformation. Aggregation. Solution Model R&D: Understand the problem and the solution required (classification, numeric forecasting, etc.) Use different techniques, get the most accurate[…]