Data and Process Modeling
A data flow diagram (DFD) uses various symbols to show how the system transforms input data into useful information.
A DFD shows how data moves through an information system but does not show program logic or processing steps. A set of DFDs provides a logical model that shows what the system does, not how it does it. That distinction is important because focusing on implementation issues at this point would restrict your search for the most effective system design.
OVERVIEW OF DATA AND PROCESS MODELING TOOLS
• Describe data and process modeling concepts and tools, including data flow diagrams, a data dictionary, and process descriptions
• Draw data flow diagrams in a sequence, from general to specific
• Explain how to level and balance a set of data flow diagrams
• Describe how a data dictionary is used and what it contains
• Use process description tools, including structured English, decision tables, and decision trees
• Describe the relationship between logical and physical models
• Describe the symbols used in data flow diagrams and explain the rules for their use
DFDs use four basic symbols that represent processes, data flows, data stores, and entities.
Several different versions of DFD symbols exist, but they all serve the same purpose.
DFD examples in this textbook use the Gane and Sarson symbol set. Another
popular symbol set is the Yourdon symbol set. Figure 5-3 shows examples of both
versions. Symbols are referenced by using all capital letters for the symbol name.
PROCESS SYMBOL
A process receives input data and produces output that has a different
content, form, or both.
DATA FLOW SYMBOL
A data flow is a path for data to move from
one part of the information system to another. A data flow in a DFD
represents one or more data items.
DATA FLOW DIAGRAMS
• Spontaneous generation. The APPLY INSURANCE PREMIUM process, for
instance, produces output, but has no input data flow. Because it has no input, the
process is called a spontaneous generation process.
• Black hole. The CALCULATE GROSS PAY process is a black hole process, which
is a process that has input, but produces no output.
• Gray hole. A gray hole is a process that has at least one input and one output, but
the input obviously is insufficient to generate the output shown. For example, a
date of birth input is not sufficient to produce a final grade output in the
CALCULATE GRADE process.
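The first two rules above are structural, so they can be checked mechanically. The following is a minimal sketch, assuming a DFD is stored as a list of (source, destination, label) data flows; that representation, the function name, and the flow labels HOURS WORKED and PREMIUM AMOUNT are illustrative assumptions, not part of the textbook example. A gray hole cannot be detected this way, because judging whether an input is sufficient requires knowing what the data means.

```python
# Hypothetical DFD representation: each data flow is (source, destination, label).
# The flow labels and the EMPLOYEE / PAYROLL FILE endpoints are made up for illustration.
FLOWS = [
    ("EMPLOYEE", "CALCULATE GROSS PAY", "HOURS WORKED"),
    ("APPLY INSURANCE PREMIUM", "PAYROLL FILE", "PREMIUM AMOUNT"),
]
PROCESSES = ["CALCULATE GROSS PAY", "APPLY INSURANCE PREMIUM"]


def check_processes(processes, flows):
    """Return a dict mapping each faulty process name to the rule it breaks."""
    problems = {}
    for name in processes:
        has_input = any(dest == name for _, dest, _ in flows)
        has_output = any(src == name for src, _, _ in flows)
        if has_output and not has_input:
            problems[name] = "spontaneous generation"
        elif has_input and not has_output:
            problems[name] = "black hole"
        elif not has_input and not has_output:
            problems[name] = "unconnected"
    return problems


RESULT = check_processes(PROCESSES, FLOWS)
print(RESULT)
```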
DATA STORE SYMBOL
A data store is used in a DFD to represent data that the system stores because one or more processes need to use the data at a later time.
ENTITY SYMBOL
The symbol for an entity is a rectangle, which
may be shaded to make it look three-dimensional. The name of
the entity appears inside the symbol.
DFD entities also are called terminators, because they are data origins or final destinations.
Systems analysts call an entity that supplies data to the system a source, and an
entity that receives data from the system a sink. An entity name is the singular form of a
department, outside organization, other information system, or person. An external
entity can be a source or a sink or both, but each entity must be connected to a process
by a data flow.
Guidelines for Drawing DFDs
• Draw the context diagram so it fits on one page.
• Use the name of the information system as the process name in the context diagram.
• Use unique names within each set of symbols.
• Do not cross lines. One way to achieve that goal is to restrict the number of symbols in any DFD.
• Provide a unique name and reference number for each process.
• Obtain as much user input and feedback as possible. Your main objective is to ensure that the model is accurate, easy to understand, and meets the needs of its users.
Step 1: Draw a Context Diagram
The first step in constructing a set of DFDs is to draw a context diagram. A context diagram
is a top-level view of an information system that shows the system’s boundaries and scope. To draw a context diagram,
you start by placing a single process symbol in the center of the page. The symbol represents the entire
information system, and you identify it as process 0 (the numeral zero, and not the letter O). Then you
place the system entities around the perimeter of the page and use data flows to connect the entities to the
central process. Data stores are not shown in the context diagram because they are contained within the
system and remain hidden until more detailed diagrams are created.
EXAMPLE: CONTEXT DIAGRAM FOR A GRADING SYSTEM The context diagram for a
grading system is shown in Figure 5-12 on the previous page. The GRADING SYSTEM
process is at the center of the diagram. The three entities (STUDENT RECORDS SYSTEM,
STUDENT, and INSTRUCTOR) are placed around the central process. Interaction among
the central process and the entities involves six different data flows. The STUDENT
RECORDS SYSTEM entity supplies data through the CLASS ROSTER data flow and
receives data through the FINAL GRADE data flow. The STUDENT entity supplies data
through the SUBMITTED WORK data flow and receives data through the GRADED
WORK data flow. Finally, the INSTRUCTOR entity supplies data through the GRADING
PARAMETERS data flow and receives data through the GRADE REPORT data flow.
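The grading system context diagram can also be written down as data. The sketch below lists the six data flows named in the example and derives which entities act as sources and which as sinks; the tuple layout and the function name are assumptions made for illustration.

```python
# Data flows of the grading system context diagram, as (source, destination, label).
SYSTEM = "GRADING SYSTEM"  # process 0 in the context diagram
CONTEXT_FLOWS = [
    ("STUDENT RECORDS SYSTEM", SYSTEM, "CLASS ROSTER"),
    (SYSTEM, "STUDENT RECORDS SYSTEM", "FINAL GRADE"),
    ("STUDENT", SYSTEM, "SUBMITTED WORK"),
    (SYSTEM, "STUDENT", "GRADED WORK"),
    ("INSTRUCTOR", SYSTEM, "GRADING PARAMETERS"),
    (SYSTEM, "INSTRUCTOR", "GRADE REPORT"),
]


def classify_entities(flows, system):
    """Map each entity to the roles it plays: 'source', 'sink', or both."""
    roles = {}
    for src, dest, _ in flows:
        if dest == system:
            roles.setdefault(src, set()).add("source")
        if src == system:
            roles.setdefault(dest, set()).add("sink")
    return roles


ROLES = classify_entities(CONTEXT_FLOWS, SYSTEM)
for entity in sorted(ROLES):
    print(entity, sorted(ROLES[entity]))
```

In this example every entity is both a source and a sink, which matches the description of each entity supplying one data flow and receiving another.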
Step 2: Draw a Diagram 0 DFD
In the previous step, you learned that a context diagram provides the most general view of an information
system and contains a single process symbol, which is like a black box. To show the detail inside the black
box, you create DFD diagram 0. Diagram 0 (the numeral zero, and not the letter O) zooms in on the system
and shows major internal processes, data flows, and data stores. Diagram 0 also repeats the entities and
data flows that appear in the context diagram. When you expand the context diagram into DFD diagram 0,
you must retain all the connections that flow into and out of process 0.
Mary Joy Atis
EXAMPLE: DIAGRAM 0 DFD FOR A GRADING SYSTEM
Figure 5-15 on the next page shows a context diagram at the top and diagram 0 beneath
it. Notice that diagram 0 is an expansion of process 0. Also notice that the same three entities
(STUDENT RECORDS SYSTEM, STUDENT, and INSTRUCTOR) and the same six
data flows (FINAL GRADE, CLASS ROSTER, SUBMITTED WORK, GRADED WORK,
GRADING PARAMETERS, and GRADE REPORT) appear in both diagrams. In addition,
diagram 0 expands process 0 to reveal four internal processes, one data store, and five
additional data flows.
EXAMPLE: DIAGRAM 0 DFD FOR AN ORDER SYSTEM Figure 5-16 on the next page
shows the diagram 0 for an order system. Process 0 on the order system’s context diagram
is exploded to reveal three processes (FILL ORDER, CREATE INVOICE, and APPLY
PAYMENT), one data store (ACCOUNTS RECEIVABLE), two additional data flows
(INVOICE DETAIL and PAYMENT DETAIL), and one diverging data flow (INVOICE).
The following walkthrough explains the DFD shown in Figure 5-16:
Step 3: Draw the Lower-Level Diagrams
This set of lower-level DFDs is based on the order system. To create lower-level diagrams,
you must use leveling and balancing techniques. Leveling is the process of drawing
a series of increasingly detailed diagrams, until all functional primitives are
identified. Balancing maintains consistency among a set of DFDs by ensuring that input
and output data flows align properly. Leveling and balancing are described in more
detail in the following sections.
Leveling uses a series of increasingly detailed DFDs to describe
an information system. For example, a system might consist of dozens, or even hundreds,
of separate processes. Using leveling, an analyst starts with an overall view,
which is a context diagram with a single process symbol. Next, the analyst creates diagram
0, which shows more detail. The analyst continues to create lower-level DFDs until all
processes are identified as functional primitives, which represent single processing
functions. More complex systems have more processes, and analysts must work
through many levels to identify the functional primitives. Leveling also is called exploding,
partitioning, or decomposing.
Balancing ensures that the input and output data flows of
the parent DFD are maintained on the child DFD. For example, Figure 5-19 shows two
DFDs: The order system diagram 0 is shown at the top of the figure, and the exploded
diagram 3 DFD is shown at the bottom.
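The balancing rule lends itself to a simple check: the data flows that cross the boundary of a child diagram should be exactly the inputs and outputs of the parent process. The following is a minimal sketch under that reading. The child process numbers (3.1, 3.2), the CUSTOMER endpoint, and the flow labels PAYMENT and POSTED PAYMENT are hypothetical and are not taken from Figure 5-19.

```python
def external_flows(child_flows, child_processes):
    """Return (inputs, outputs): labels of flows that cross the child diagram's boundary."""
    inputs = {label for src, dest, label in child_flows
              if src not in child_processes and dest in child_processes}
    outputs = {label for src, dest, label in child_flows
               if src in child_processes and dest not in child_processes}
    return inputs, outputs


def is_balanced(parent_inputs, parent_outputs, child_flows, child_processes):
    """A child DFD is balanced if its boundary flows match the parent process exactly."""
    child_inputs, child_outputs = external_flows(child_flows, child_processes)
    return child_inputs == set(parent_inputs) and child_outputs == set(parent_outputs)


# Hypothetical explosion of a parent process 3 into two child processes.
CHILD_PROCESSES = {"3.1", "3.2"}
CHILD_FLOWS = [
    ("CUSTOMER", "3.1", "PAYMENT"),                      # crosses the boundary inward
    ("3.1", "3.2", "POSTED PAYMENT"),                    # internal to the child diagram
    ("3.2", "ACCOUNTS RECEIVABLE", "PAYMENT DETAIL"),    # crosses the boundary outward
]

BALANCED = is_balanced({"PAYMENT"}, {"PAYMENT DETAIL"}, CHILD_FLOWS, CHILD_PROCESSES)
# Dropping the outgoing flow leaves the parent's output unaccounted for.
UNBALANCED = is_balanced({"PAYMENT"}, {"PAYMENT DETAIL"}, CHILD_FLOWS[:-1], CHILD_PROCESSES)
print(BALANCED, UNBALANCED)
```

Internal flows such as POSTED PAYMENT are ignored on purpose: leveling is expected to reveal new detail inside the child diagram, and only the boundary must match.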
DATA DICTIONARY
A set of DFDs produces a logical model of the system, but the details within those DFDs
are documented separately in a data dictionary, which is the second component of structured
analysis.
A data dictionary, or data repository, is a central storehouse of information about
the system’s data. An analyst uses the data dictionary to collect, document, and organize
specific facts about the system, including the contents of data flows, data stores, entities,
and processes. The data dictionary also defines and describes all data elements and
meaningful combinations of data elements.
A data element, also called a data item or field, is the smallest piece of data that has meaning within an information system.
Data elements are combined into records, also called data structures. A record is a meaningful combination of related data elements that is
included in a data flow or retained in a data store.
Using CASE Tools for Documentation
The more complex the system, the more difficult it is to maintain full and accurate documentation.
Fortunately, modern CASE tools simplify the task. For example, in the Visible
Analyst CASE tool, documentation automatically flows from the modeling diagrams into
the central repository, along with information entered by the user. This section contains
several examples of Visible Analyst screens that show the data repository and its contents.
Documenting the Data Elements
You must document every data element in the data dictionary. Some analysts like to
record their notes on online or manual forms. Others prefer to enter the information
directly into a CASE tool. Several of the DFDs and data dictionary entries that appear in
this chapter were created using a popular CASE tool called Visible Analyst. Although
other CASE tools might use other terms or display the information differently, the objective
is the same: to provide clear, comprehensive information about the data and processes
that make up the system.
Figure 5-23 shows how the analyst used an online documentation form to record
information for the SOCIAL SECURITY NUMBER data element. Notice that the figure
caption identifies eight specific characteristics for this data element.
Regardless of the terminology or method, the following attributes usually are recorded and described in the data dictionary:
• Data element name or label. The data element’s standard name, which should be meaningful to users.
• Alias. Any name(s) other than the standard data element name; this alternate name is called an alias. For example, if you have a data element named CURRENT BALANCE, various users might refer to it by alternate names such as OUTSTANDING BALANCE, CUSTOMER BALANCE, or RECEIVABLE BALANCE.
• Type and length. Type refers to whether the data element contains numeric, alphabetic, or character values. Length is the maximum number of characters for an alphabetic or character data element or the maximum number of digits and number of decimal positions for a numeric data element. In addition to text and numeric data, sounds and images also can be stored in digital form. In some systems, these binary data objects are managed and processed just as traditional data elements are. For example, an employee record might include a digitized photo image of the person.
• Default value. The value for the data element if a value otherwise is not entered for it. For example, all new customers might have a default value of $500 for the CREDIT LIMIT data element.
• Acceptable values. Specification of the data element’s domain, which is the set of values permitted for the data element; these values either can be specifically listed or referenced in a table, or can be selected from a specified range of values. You also would indicate if a value for the data element is optional. Some data elements have additional validity rules. For example, an employee’s salary must be within the range defined for the employee’s job classification.
• Source. The specification for the origination point for the data element’s values. The source could be a specific form, a department or outside organization, another information system, or the result of a calculation.
• Security. Identification for the individual or department that has access or update privileges for each data element. For example, only a credit manager has the authority to change a credit limit, while sales reps are authorized to access data in a read-only mode.
• Responsible user(s). Identification of the user(s) responsible for entering and changing values for the data element.
• Description and comments. This part of the documentation allows you to enter additional notes.
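The attributes listed above map naturally onto a record type. The following is a minimal sketch, not the layout used by Visible Analyst or any other CASE tool: the class name, field names, and the 0 to 10,000 range are assumptions, and the domain is simplified to a numeric range even though a domain also can be a list or table of values. Only the $500 default for CREDIT LIMIT comes from the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataElement:
    """One data dictionary entry for a data element (hypothetical layout)."""
    name: str
    aliases: list = field(default_factory=list)
    data_type: str = "character"       # numeric, alphabetic, or character
    length: int = 0
    default: Optional[float] = None    # value used when none is entered
    minimum: Optional[float] = None    # domain lower bound (assumed range-style domain)
    maximum: Optional[float] = None    # domain upper bound
    source: str = ""
    security: str = ""
    responsible_user: str = ""
    description: str = ""

    def is_valid(self, value=None):
        """Apply the default if no value is given, then test it against the domain."""
        if value is None:
            value = self.default
        if value is None:
            return False
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


CREDIT_LIMIT = DataElement(
    name="CREDIT LIMIT",
    data_type="numeric",
    length=7,
    default=500,            # default from the text: $500 for new customers
    minimum=0,              # assumed range, for illustration only
    maximum=10000,
    security="Credit manager may update; sales reps have read-only access",
)

print(CREDIT_LIMIT.is_valid(), CREDIT_LIMIT.is_valid(750), CREDIT_LIMIT.is_valid(25000))
```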
Documenting the Data Flows
In addition to documenting each data element, you must document all data flows in the data dictionary. Typical attributes are as follows:
• Data flow name or label. The data flow name as it appears on the DFDs.
• Description. Describes the data flow and its purpose.
• Alternate name(s). Aliases for the DFD data flow name(s).
• Origin. The DFD beginning, or source, for the data flow; the origin can be a process, a data store, or an entity.
• Destination. The DFD ending point(s) for the data flow; the destination can be a process, a data store, or an entity.
• Record. Each data flow represents a group of related data elements called a record or data structure. In most data dictionaries, records are defined separately from the data flows and data stores. When records are defined, more than one data flow or data store can use the same record, if necessary.
• Volume and frequency. Describes the expected number of occurrences for the data flow per unit of time. For example, if a company has 300 employees, a TIME CARD data flow would involve 300 transactions and records each week, as employees submit their work hour data.
Documenting the Data Stores
You must document every DFD data store in the data dictionary.
Typical characteristics of a data store are as follows:
• Data store name or label. The data store name as it appears on the DFDs.
• Description. Describes the data store and its purpose.
• Alternate name(s). Aliases for the DFD data store name.
• Input and output data flows. The standard DFD names for the data flows that enter or leave the data store.
• Volume and frequency. Describes the estimated number of records in the data store and how frequently they are updated.
Documenting the Processes
Your documentation includes a description of the process’s characteristics and, for functional primitives, a process description, which is a model that documents the processing steps and business logic.
The following are typical characteristics of a process:
• Process name or label. The process name as it appears on the DFDs.
• Description. A brief statement of the process’s purpose.
• Process number. A reference number that identifies the process and indicates relationships among various levels in the system.
• Process description. This section includes the input and output data flows. For functional primitives, the process description also documents the processing steps and business logic.
Documenting the Entities
By documenting all entities, the data dictionary can describe all external entities that interact with the system.
Typical characteristics of an entity include the following:
• Entity name. The entity name as it appears on the DFDs.
• Description. Describes the entity and its purpose.
• Alternate name(s). Any aliases for the entity name.
• Input data flows. The standard DFD names for the input data flows to the entity.
• Output data flows. The standard DFD names for the data flows leaving the entity.
Documenting the Records
A record is a data structure that contains a set of related data elements that are stored and processed together. Data flows and data stores consist of records that you must document in the data dictionary.
Typical characteristics of a record include the following:
• Record or data structure name. The record name as it appears in the related data flow and data store entries in the data dictionary.
• Definition or description. A brief definition of the record.
• Alternate name(s). Any aliases for the record name.
• Attributes. A list of all the data elements included in the record. The data element names must match exactly what you entered in the data dictionary.
Data Dictionary Reports
The data dictionary serves as a central storehouse of documentation for an information
system. A data dictionary is created when the system is developed, and is updated constantly
as the system is implemented, operated, and maintained. In addition to describing
each data element, data flow, data store, record, entity, and process,
the data dictionary documents the relationships among these components. You can obtain many valuable
reports from a data dictionary, including the following:
• An alphabetized list of all data elements by name
• A report describing each data element and indicating the user or department that
is responsible for data entry, updating, or deletion
• A report of all data flows and data stores that use a particular data element
• Detailed reports showing all characteristics of data elements, records, data flows,
processes, or any other selected item stored in the data dictionary.
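Two of the reports above, the alphabetized element list and the report of data flows and data stores that use a particular element, can be sketched in a few lines. The dictionary contents, record names, and function names below are hypothetical; a CASE tool would generate such reports from its own repository.

```python
# Hypothetical dictionary contents: record name -> the data elements it contains.
RECORDS = {
    "TIME CARD RECORD": ["EMPLOYEE NUMBER", "HOURS WORKED", "WEEK ENDING DATE"],
    "CUSTOMER RECORD": ["CUSTOMER NUMBER", "CURRENT BALANCE", "CREDIT LIMIT"],
}
# Data flow or data store name -> the record it carries or holds.
USES_RECORD = {
    "TIME CARD": "TIME CARD RECORD",            # data flow
    "ACCOUNTS RECEIVABLE": "CUSTOMER RECORD",   # data store
    "PAYMENT DETAIL": "CUSTOMER RECORD",        # data flow
}


def element_list(records):
    """Alphabetized list of all data elements by name."""
    return sorted({element for elements in records.values() for element in elements})


def where_used(element, records, uses_record):
    """All data flows and data stores whose record contains the given element."""
    return sorted(name for name, record in uses_record.items()
                  if element in records[record])


ALL_ELEMENTS = element_list(RECORDS)
USED_BY = where_used("CREDIT LIMIT", RECORDS, USES_RECORD)
print(ALL_ELEMENTS)
print(USED_BY)
```

Because records are defined once and shared, the where-used report follows the record reference instead of repeating element lists in every data flow and data store.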
PROCESS DESCRIPTION TOOLS
A process description documents the details of a functional primitive, and represents a
specific set of processing steps and business logic. Using a set of process description
tools, you create a model that is accurate, complete, and concise. Typical process
description tools include structured English, decision tables, and decision trees. When
you analyze a functional primitive, you break the processing steps down into smaller
units in a process called modular design.
Modular design is based on combinations of three logical structures, sometimes called control structures, which serve as building blocks for the process. Each logical structure must have a single entry and exit point. The three structures are called sequence, selection, and iteration. A rectangle represents a step or process, a diamond shape represents a condition or decision, and the logic follows the lines in the direction indicated by the arrows.
• Sequence. The completion of steps in sequential order, one after another, as shown in Figure 5-30. One or more of the steps might represent a subprocess that contains additional logical structures.
• Selection. The completion of one of two or more process steps based on the results of a test or condition. In the example shown in Figure 5-31, the system tests the input, and if the hours are greater than 40, it performs the CALCULATE OVERTIME PAY process.
• Iteration. The completion of a process step that is repeated until a specific condition changes, as shown in Figure 5-32. An example of iteration is a process that continues to print paychecks until it reaches the end of the payroll file. Iteration also is called looping.
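The three control structures can be seen side by side in a short payroll sketch that reuses the examples above: the test for hours greater than 40 and the loop over a payroll file. The function names, the sample employees, and the 1.5 overtime multiplier are assumptions made for illustration.

```python
def calculate_pay(hours, rate):
    """Compute pay for one employee using sequence and selection."""
    # Sequence: these steps run one after another.
    regular_hours = min(hours, 40)
    pay = regular_hours * rate
    # Selection: overtime is calculated only if the condition is true.
    if hours > 40:
        pay += (hours - 40) * rate * 1.5   # overtime multiplier is an assumption
    return pay


def print_paychecks(payroll_file):
    """Iteration: repeat the step until the end of the payroll file is reached."""
    lines = []
    for name, hours, rate in payroll_file:
        lines.append(f"{name}: {calculate_pay(hours, rate):.2f}")
    return lines


# Hypothetical payroll file: (employee name, hours worked, hourly rate).
PAYROLL_FILE = [("LEE", 38, 20.0), ("CRUZ", 45, 20.0)]
CHECKS = print_paychecks(PAYROLL_FILE)
for line in CHECKS:
    print(line)
```

Each function has a single entry and a single exit point, which is the property the text requires of every logical structure.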
Structured English is a subset of standard English that describes logical processes clearly and accurately. When you use structured English, you must conform to the following rules:
• Use only the three building blocks of sequence, selection, and iteration.
• Use indentation for readability.
• Use a limited vocabulary, including standard terms used in the data dictionary
and specific words that describe the processing rules.
A decision table is a logical structure that shows every combination of conditions and outcomes. Analysts often use decision tables to describe a process and ensure that they have considered all possible situations. You can create decision tables using Microsoft PowerPoint, Word, or Excel.
TABLES WITH ONE CONDITION
If a process has a single condition, there are only
two possibilities: yes or no. Either the condition is present or it is not, so there are only two rules.
TABLES WITH TWO CONDITIONS
Suppose you want to create a decision table based
on the Verify Order business process shown in Figure 5-34. When documenting a process,
it is important to ensure that you list every possibility. In this example, the process
description contains two conditions: product stock status and customer credit status.
If both conditions are met, the order is accepted. Otherwise the order is rejected.
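With two yes-or-no conditions there are four rules to list. The sketch below enumerates them for the Verify Order logic described above; the function and variable names are illustrative assumptions.

```python
from itertools import product


def verify_order(in_stock, credit_ok):
    """Accept the order only when both conditions are met."""
    return "accept" if in_stock and credit_ok else "reject"


# Every combination of the two conditions, paired with its outcome.
RULES = [(in_stock, credit_ok, verify_order(in_stock, credit_ok))
         for in_stock, credit_ok in product([True, False], repeat=2)]

for number, (in_stock, credit_ok, action) in enumerate(RULES, start=1):
    print(f"Rule {number}: in stock={in_stock}, credit OK={credit_ok} -> {action}")
```

Generating the combinations instead of typing them by hand is one way to make sure no possibility is left out.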
TABLES WITH THREE CONDITIONS
Suppose the company now decides that the credit manager
can waive the customer credit requirement, as shown in
Figure 5-36. That creates a third condition, so there will be
eight possible rules. The new decision table might resemble the
one shown in Figure 5-37.
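The number of rules doubles with each added yes-or-no condition, so three conditions give 2 × 2 × 2 = 8 rules. The sketch below builds that table; the outcome rule (accept when the product is in stock and the credit is either acceptable or waived) is an assumed reading of the waiver policy, not a copy of Figure 5-37.

```python
from itertools import product

CONDITIONS = ["product in stock", "credit status OK", "waiver by credit manager"]


def outcome(in_stock, credit_ok, waived):
    """Assumed business rule: a waiver substitutes for acceptable credit."""
    return "accept" if in_stock and (credit_ok or waived) else "reject"


# One row per combination of the three conditions, with the outcome appended.
TABLE = [combo + (outcome(*combo),)
         for combo in product([True, False], repeat=len(CONDITIONS))]
ACCEPTED = [row for row in TABLE if row[-1] == "accept"]

print(len(TABLE), "rules,", len(ACCEPTED), "of them accept the order")
```

Rules that share an outcome regardless of one condition, such as every rule where the product is out of stock, are candidates for being combined when the table is simplified.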
In addition to multiple conditions, decision tables can have more than two possible outcomes.
Decision tables often are the best way to describe a complex set of conditions. Many analysts use decision tables because they are easy to construct and understand, and programmers find it easy to work from a decision table when developing code.
LOGICAL VERSUS PHYSICAL MODELS
While structured analysis tools are used to develop a logical model for a new information
system, such tools also can be used to develop physical models of an information system.
A physical model shows how the system’s requirements are implemented. During the systems design phase, you create a physical model of the new information system that follows from the logical model and involves operational tasks and techniques.
Sequence of Models
What is the relationship between logical and physical models? Think back to the beginning
of the systems analysis phase, when you were trying to understand the existing system.
Rather than starting with a logical model, you first studied the physical operations of the
existing system to understand how the current tasks were carried out. Many systems analysts
create a physical model of the current system and then develop a logical model of the
current system before tackling a logical model of the new system. Performing that extra
step allows them to understand the current system better.
Many analysts follow a four-model approach, which means that they develop a physical
model of the current system, a logical model of the current system, a logical model of
the new system, and a physical model of the new system. The major benefit of the four-model
approach is that it gives you a clear picture of current system functions before
you make any modifications or improvements. That is important because mistakes made
early in systems development will affect later SDLC phases and can result in unhappy
users and additional costs. Taking additional steps to avoid these potentially costly mistakes
can prove to be well worth the effort. Another advantage is that the requirements
of a new information system often are quite similar to those of the current information
system, especially where the proposal is based on new computer technology rather than a
large number of new requirements. Adapting the current system logical model to the
new system logical model in these cases is a straightforward process.
The only disadvantage of the four-model approach is the added time and cost needed
to develop a logical and physical model of the current system. Most projects have very
tight schedules that might not allow time to create the current system models.
Additionally, users and managers want to see progress on the new system — they are
much less concerned about documenting the current system. As a systems analyst, you
must stress the importance of careful documentation and resist the pressure to hurry the
development process at the risk of creating serious problems later.
During data and process modeling, a systems analyst develops graphical models to show
how the system transforms data into useful information. The end product of data and
process modeling is a logical model that will support business operations and meet user
needs. Data and process modeling involves three main tools: data flow diagrams, a data
dictionary, and process descriptions.
Data flow diagrams (DFDs) graphically show the movement and transformation of
data in the information system. DFDs use four symbols: The process symbol transforms
data; the data flow symbol shows data movement; the data store symbol shows data at
rest; and the external entity symbol represents someone or something connected to the
information system. Various rules and techniques are used to name, number, arrange,
and annotate the set of DFDs to make them consistent and understandable.
A set of DFDs is like a pyramid with the context diagram at the top. The context diagram
represents the information system’s scope and its external connections but not its
internal workings. Diagram 0 displays the information system’s major processes, data
stores, and data flows and is the exploded version of the context diagram’s process symbol,
which represents the entire information system. Lower-level DFDs show additional
detail of the information system through the leveling technique of numbering and partitioning.
Leveling continues until you reach the functional primitive processes, which are
not decomposed further and are documented with process descriptions. All diagrams
must be balanced to ensure their consistency and accuracy.
The data dictionary is the central documentation tool for structured analysis. All data
elements, data flows, data stores, processes, entities, and records are documented in the data
dictionary. Consolidating documentation in one location allows you to verify the information
system’s accuracy and consistency more easily and generate a variety of useful reports.
Each functional primitive process is documented using structured English, decision
tables, and decision trees. Structured English uses a subset of standard English that defines
each process with combinations of the basic building blocks of sequence, selection, and iteration.
You also can document the logic by using decision tables or decision trees.
Structured analysis tools can be used to develop a logical model during the systems
analysis phase, and a physical model during the systems design phase. Many analysts
use a four-model approach, which involves a physical model of the current system, a
logical model of the current system, a logical model of the new system, and a physical
model of the new system.