Top Informatica Interview Questions and Answers
What is Informatica?
Informatica is a leading data integration software company providing tools and technologies for extracting, transforming, and loading (ETL) data. Informatica PowerCenter is their flagship ETL tool, widely used for building and managing data warehouses and other data integration projects. Informatica's tools help organizations collect data from multiple sources, transform it to meet specific requirements, and load the data into target systems.
Informatica Products
Informatica offers a suite of data integration and management tools:
- PowerCenter
- PowerMart
- PowerExchange
- PowerCenter Connect
- PowerCenter Data Quality
- Other products
Informatica PowerCenter
Informatica PowerCenter is a robust ETL tool that's used for building and managing enterprise data warehouses. Its capabilities include data extraction, transformation, and loading into target systems. It enables organizations to consolidate and analyze data from many different sources.
Data Warehouses
A data warehouse is a central repository of data designed for analytical processing and reporting. Unlike operational databases focused on transactions, data warehouses focus on providing historical and consolidated information to support decision-making and business intelligence.
Using Informatica in an Organization
Informatica is used to solve data-related challenges such as:
- Data Migration: Moving data from legacy systems to new systems.
- Data Warehousing: Building and populating data warehouses.
- Data Integration: Combining data from various sources.
- Data Quality Management: Improving data accuracy and consistency.
Informatica Workflow
An Informatica workflow is a sequence of tasks that execute mappings (data transformations). Workflows define the overall execution flow of your data integration jobs. Workflows are created using the Informatica Workflow Manager.
Types of Transformations in Informatica
Transformations are the core of data manipulation in Informatica. They modify data and handle things such as data type conversions, data cleansing, and aggregations.
- Active Transformations: Change the number of rows (e.g., aggregations).
- Passive Transformations: Do not change the number of rows.
Data Warehouse vs. Data Mart
| Data Warehouse | Data Mart |
|---|---|
| Centralized repository; large scope; supports complex queries. | Subset of a data warehouse; smaller scope; faster query access for a specific business area. |
Repository Manager in Informatica
The Informatica Repository Manager is a tool for managing the Informatica repository—a relational database storing metadata about mappings, workflows, and other objects. It provides tools for managing users, permissions, and the repository's structure.
Mappings in Informatica
A mapping is a visual representation of how data flows from source to target, including transformations. It defines the ETL process.
- Source Definition: Defines the source data.
- Target Definition: Defines the target data.
- Transformations: Perform data manipulations.
- Links: Connect sources, transformations, and targets.
Sessions in Informatica
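A session in Informatica executes a mapping. It holds the run-time settings for the job, such as source and target connections, parameters, and commit and error-handling properties; scheduling is configured on the workflow that contains the session. Each session is associated with a single mapping.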
Informatica Designer
Informatica PowerCenter Designer is the primary tool for creating and managing mappings. It offers a graphical interface for designing the data flow and defining transformations. Key components include the Navigator, Workspace, Toolbar, Output/Control Panel, and Status Bar.
Domains in Informatica
A domain is the primary administrative unit of an Informatica environment: a collection of nodes and services that is managed centrally. It's used to manage users, security settings, and configuration for the services that run in it.
Workflow Manager in Informatica
Workflow Manager is a tool used to create and manage workflows—sequences of tasks that execute mappings. Workflows automate and schedule data integration jobs.
Workflows and Worklets in Informatica
- Workflows: Define the execution order of mappings and other tasks.
- Worklets: Reusable groups of tasks that can be included in multiple workflows.
Workflow Monitor in Informatica
Workflow Monitor tracks the execution of workflows. It provides information about job status, progress, and errors.
Types of Transformations in Informatica
Transformations in Informatica modify data as it flows through a mapping. They are categorized as active or passive, based on whether they alter the number of rows:
- Active Transformations: Can change the number of rows that pass through (e.g., aggregations, filters, joins). Examples: Aggregator, Filter, Joiner, Router, Rank.
- Passive Transformations: Do not change the number of rows (e.g., row-level expressions). Examples: Expression, Stored Procedure, Sequence Generator.
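The active/passive distinction can be illustrated outside Informatica with a minimal Python sketch. The sample rows and the function names `expression_like` and `filter_like` are hypothetical stand-ins, not Informatica APIs; the only point is that one keeps the row count and the other may not.

```python
# Hypothetical sample rows; field names are illustrative only.
rows = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": 40.0},
    {"id": 3, "amount": 75.0},
]

def expression_like(rows):
    """Passive: one output row per input row, with a derived column added."""
    return [{**r, "amount_with_tax": round(r["amount"] * 1.1, 2)} for r in rows]

def filter_like(rows, threshold):
    """Active: rows failing the condition are dropped, so the count can change."""
    return [r for r in rows if r["amount"] >= threshold]

passive_out = expression_like(rows)
active_out = filter_like(rows, 50.0)
print(len(rows), len(passive_out), len(active_out))
```

Here the passive step returns as many rows as it received, while the active step returns fewer because one row fails the condition.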
Source Qualifier Transformation (SQ)
The Source Qualifier transformation reads data from relational and flat-file sources and converts the source data types into Informatica's own transformation data types. It's an active transformation, because options such as a source filter or selecting distinct rows can change the number of rows passed downstream.
Expression Transformation
An Expression transformation is a passive transformation that manipulates data within individual rows. It performs operations like data type conversions, string manipulations, date calculations, and conditional logic.
Sorter Transformation
The Sorter transformation sorts data based on specified columns (ascending or descending). It's an active transformation because its Distinct option can remove duplicate rows. The Sorter holds the incoming rows in a cache (spilling to temporary disk space when needed) and passes them on only after sorting.
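As a rough analogy in plain Python (not Informatica code), sorting with an optional distinct step might look like the sketch below. The function `sorter_like` and its parameters are hypothetical, and it sorts on a single key column for brevity.

```python
# Hypothetical rows as (state, quantity) tuples.
rows = [("NY", 3), ("CA", 1), ("NY", 3), ("TX", 2)]

def sorter_like(rows, key_index=0, descending=False, distinct=False):
    """Sort rows on one column; optionally drop exact duplicate rows first."""
    if distinct:
        # dict.fromkeys keeps the first occurrence of each identical row.
        rows = list(dict.fromkeys(rows))
    return sorted(rows, key=lambda r: r[key_index], reverse=descending)

sorted_rows = sorter_like(rows)
distinct_rows = sorter_like(rows, distinct=True)
print(len(sorted_rows), len(distinct_rows))
```

With the distinct step enabled, the duplicate `("NY", 3)` row is dropped, which is why the transformation counts as active.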
Aggregator Transformation
Aggregator transformation performs aggregate functions (like SUM, AVG, COUNT) on groups of rows. It summarizes data, reducing the number of rows in the output.
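The grouping behaviour can be sketched in plain Python as follows. This is an illustration of group-by aggregation in general, not Informatica's implementation; `aggregate_like` and the sample columns are assumed names.

```python
from collections import defaultdict

# Hypothetical input rows.
rows = [
    {"region": "east", "sales": 100},
    {"region": "west", "sales": 50},
    {"region": "east", "sales": 25},
]

def aggregate_like(rows, group_key, value_key):
    """Return one summary row (sum, count, avg) per distinct group value."""
    totals = defaultdict(lambda: {"sum": 0, "count": 0})
    for r in rows:
        bucket = totals[r[group_key]]
        bucket["sum"] += r[value_key]
        bucket["count"] += 1
    return [
        {group_key: key, "sum": t["sum"], "count": t["count"],
         "avg": t["sum"] / t["count"]}
        for key, t in totals.items()
    ]

summary = aggregate_like(rows, "region", "sales")
print(summary)
```

Three input rows collapse to two output rows, one per region.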
Filter Transformation
Filter transformation filters rows based on a specified condition. Rows that don't meet the criteria are excluded from the output.
Joiner Transformation
Joiner transformation performs joins between two sources (master and detail). Join types include:
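- Master Outer Join: keeps all rows from the detail source and only the matching rows from the master source.
- Detail Outer Join: keeps all rows from the master source and only the matching rows from the detail source.
- Full Outer Join: keeps all rows from both sources.
- Inner Join: keeps only the rows that match in both sources.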
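Because the master/detail naming is easy to mix up, a small Python sketch of the four join types may help. It is a simplified nested-loop illustration with hypothetical data and a hypothetical `joiner_like` function, not how the Joiner is implemented.

```python
# Hypothetical sources: one master row and one detail row have no partner.
master = [{"dept_id": 10, "dept": "Sales"}, {"dept_id": 30, "dept": "Legal"}]
detail = [{"emp": "Ann", "dept_id": 10}, {"emp": "Bob", "dept_id": 20}]

def joiner_like(master, detail, key, join_type="inner"):
    """join_type: 'inner', 'master_outer', 'detail_outer', or 'full_outer'."""
    master_cols = {c for m in master for c in m}
    detail_cols = {c for d in detail for c in d}
    out, matched_keys = [], set()
    for d in detail:
        matches = [m for m in master if m[key] == d[key]]
        for m in matches:
            matched_keys.add(m[key])
            out.append({**m, **d})
        # Master outer keeps every detail row, even with no master match.
        if not matches and join_type in ("master_outer", "full_outer"):
            out.append({**dict.fromkeys(master_cols), **d})
    # Detail outer keeps every master row, even with no detail match.
    if join_type in ("detail_outer", "full_outer"):
        for m in master:
            if m[key] not in matched_keys:
                out.append({**dict.fromkeys(detail_cols), **m})
    return out

results = {
    jt: joiner_like(master, detail, "dept_id", jt)
    for jt in ("inner", "master_outer", "detail_outer", "full_outer")
}
print({jt: len(rows) for jt, rows in results.items()})
```

In this sample, the master outer join keeps the unmatched detail row (Bob) with an empty department, and the detail outer join keeps the unmatched master row (Legal) with an empty employee.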
Router Transformation
A router transformation routes rows to different output links based on specified conditions. Multiple conditions can be used to direct rows to various paths.
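A minimal Python sketch of the routing idea is shown below. The group names, the conditions, and the `router_like` helper are hypothetical. The sketch assumes each row is tested against every group condition, so a row can land in more than one group, and rows matching no condition go to a default group.

```python
# Hypothetical input rows.
rows = [
    {"id": 1, "country": "US", "amount": 500},
    {"id": 2, "country": "DE", "amount": 50},
    {"id": 3, "country": "US", "amount": 20},
]

# Hypothetical group conditions, keyed by output group name.
groups = {
    "US_ROWS": lambda r: r["country"] == "US",
    "HIGH_VALUE": lambda r: r["amount"] >= 100,
}

def router_like(rows, groups):
    """Send each row to every group whose condition it meets, else to DEFAULT."""
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for r in rows:
        hit = False
        for name, condition in groups.items():
            if condition(r):
                routed[name].append(r)
                hit = True
        if not hit:
            routed["DEFAULT"].append(r)
    return routed

routed = router_like(rows, groups)
print({name: [r["id"] for r in out] for name, out in routed.items()})
```

Unlike a chain of Filter steps, the input is read once and split across all outputs in a single pass.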
Rank Transformation
Rank transformation ranks rows, optionally within groups, and passes on only the top or bottom N of them. It's often used for tasks like finding the top N records within categories.
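The top-N-per-group idea can be sketched in plain Python like this. `rank_like`, the column names, and the `rank_index` output field are illustrative assumptions, and ties are handled naively by sort order.

```python
from collections import defaultdict

# Hypothetical input rows.
rows = [
    {"region": "east", "rep": "a", "sales": 100},
    {"region": "east", "rep": "b", "sales": 300},
    {"region": "east", "rep": "c", "sales": 200},
    {"region": "west", "rep": "d", "sales": 50},
]

def rank_like(rows, group_key, rank_key, top_n):
    """Keep the top_n rows per group by rank_key and attach their position."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r[group_key]].append(r)
    out = []
    for members in grouped.values():
        ordered = sorted(members, key=lambda r: r[rank_key], reverse=True)
        for position, r in enumerate(ordered[:top_n], start=1):
            out.append({**r, "rank_index": position})
    return out

top_two = rank_like(rows, "region", "sales", top_n=2)
print([(r["rep"], r["rank_index"]) for r in top_two])
```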
Sequence Generator Transformation
This transformation generates sequential numeric values; often used to create unique keys or identifiers.
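As a loose analogy, Python's `itertools.count` behaves like a simple sequence with a start value and an increment. The helper name `sequence_generator_like` and the `customer_key` column are hypothetical.

```python
from itertools import count

def sequence_generator_like(start=1, increment=1):
    """Return an iterator that yields start, start + increment, and so on."""
    return count(start, increment)

nextval = sequence_generator_like()
names = ["Ann", "Bob", "Cy"]  # hypothetical incoming rows

# Attach a surrogate key to each row as it passes through.
keyed = [{"customer_key": next(nextval), "name": name} for name in names]
print(keyed)
```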
Stored Procedure Transformation
This transformation executes stored procedures (pre-compiled database code). It passes parameters to the stored procedure and retrieves its results. The Stored Procedure transformation can be configured as connected (part of the mapping's data flow) or unconnected (called from an expression or run before or after the session).
Lookup Transformation
A Lookup transformation retrieves data from a lookup source (often a relational table or flat file) based on specified conditions. It's a powerful way to enrich your data by adding information from a reference table. Lookup transformations can be used in both connected and unconnected modes.
Lookup Transformation Components
- Lookup Table: The reference table used for lookups; imported from the database or a file.
- Lookup Condition: The condition to find matching rows in the lookup table.
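The lookup table and lookup condition can be pictured with a small Python sketch: the table is loaded into an in-memory dictionary (similar in spirit to a cache) and the condition is an equality match on a key. The data, column names, and `lookup_like` helper are hypothetical.

```python
# Hypothetical reference (lookup) table.
lookup_table = [
    {"country_code": "US", "country_name": "United States"},
    {"country_code": "DE", "country_name": "Germany"},
]

# Build the lookup structure once, keyed on the lookup-condition column.
cache = {row["country_code"]: row["country_name"] for row in lookup_table}

def lookup_like(rows, cache, default=None):
    """Add country_name to each row; unmatched rows get the default value."""
    return [
        {**r, "country_name": cache.get(r["country_code"], default)}
        for r in rows
    ]

source_rows = [
    {"order_id": 1, "country_code": "US"},
    {"order_id": 2, "country_code": "FR"},
]
enriched = lookup_like(source_rows, cache)
print(enriched)
```

Every input row produces exactly one output row here, with an empty value when no match is found.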
Lookup Transformation Actions
- Retrieving a single value.
- Retrieving multiple values.
- Performing calculations using retrieved data.
- Updating slowly changing dimension (SCD) tables.
Union Transformation
Union transformation combines data from multiple sources, similar to UNION ALL in SQL. It does not remove duplicate rows; a Sorter transformation with the "Select Distinct" option would be needed to remove duplicates.
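The difference between merging rows and de-duplicating them can be shown with a short Python sketch. `union_like` and the sample inputs are hypothetical; the separate distinct step stands in for a downstream Sorter with distinct enabled.

```python
def union_like(*inputs):
    """Concatenate all input row lists, keeping duplicates (UNION ALL style)."""
    merged = []
    for rows in inputs:
        merged.extend(rows)
    return merged

# Hypothetical inputs with one overlapping row.
source_a = [("Ann",), ("Bob",)]
source_b = [("Bob",), ("Cy",)]

merged = union_like(source_a, source_b)
# A separate distinct-and-sort step is needed to drop the duplicate row.
distinct_sorted = sorted(set(merged))
print(len(merged), len(distinct_sorted))
```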
Update Strategy Transformation
Update strategy transformations specify how rows should be handled in the target table (insert, update, delete, or reject). You can configure the strategy at the session level (applying to all rows) or at the mapping level (applying rules to individual rows).
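Row-level flagging can be sketched in Python as below. The constants mirror the numeric values Informatica uses for DD_INSERT (0), DD_UPDATE (1), DD_DELETE (2), and DD_REJECT (3); the decision rules, the `flag_row` helper, and the sample data are hypothetical.

```python
# Numeric flags matching Informatica's update strategy constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row, existing_keys):
    """Pick an action for one row; the rules here are illustrative only."""
    if row["id"] is None:
        return DD_REJECT          # no key, so the row cannot be applied
    if row.get("deleted"):
        return DD_DELETE          # source marks the row as removed
    if row["id"] in existing_keys:
        return DD_UPDATE          # key already present in the target
    return DD_INSERT              # new key

existing_keys = {1, 2}            # hypothetical keys already in the target
rows = [
    {"id": 1, "deleted": False},
    {"id": 2, "deleted": True},
    {"id": 3, "deleted": False},
    {"id": None, "deleted": False},
]
flags = [flag_row(r, existing_keys) for r in rows]
print(flags)
```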
Source Qualifier (SQ) Transformation Tasks
Source Qualifier transformations perform several key actions:
- Joins: Join data from multiple tables (usually using primary and foreign key relationships).
- Filtering: Select specific rows based on conditions (using WHERE clauses).
- Sorting: Order data (using ORDER BY).
- Distinct Rows: Select only unique rows.
- Custom SQL Queries: Allow writing custom SQL for data retrieval and manipulation.
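For relational sources, these tasks correspond to clauses of a SQL query. The sketch below uses Python's built-in `sqlite3` module with hypothetical `customers` and `orders` tables to show a join, filter, sort, and distinct combined in one query; it illustrates the SQL concepts only and is not the query Informatica itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source tables linked by a primary/foreign key pair.
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    status TEXT
);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES
    (10, 1, 'SHIPPED'), (11, 1, 'SHIPPED'), (12, 2, 'OPEN'), (13, 2, 'SHIPPED');
""")

query = """
SELECT DISTINCT c.name, o.status              -- distinct rows
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id -- join on the key relationship
WHERE o.status = 'SHIPPED'                     -- filtering
ORDER BY c.name                                -- sorting
"""
result = conn.execute(query).fetchall()
conn.close()
print(result)
```

Pushing this work into the source query means fewer rows have to travel through the rest of the mapping.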