Loading presentation...

Present Remotely

Send the link below via email or IM

Copy

Present to your audience

Start remote presentation

  • Invited audience members will follow you as you navigate and present
  • People invited to a presentation do not need a Prezi account
  • This link expires 10 minutes after you close the presentation
  • A maximum of 30 users can follow your presentation
  • Learn more about this feature in our knowledge base article

Do you really want to delete this prezi?

Neither you, nor the coeditors you shared it with will be able to recover it again.

DeleteCancel

Make your likes visible on Facebook?

Connect your Facebook account to Prezi and let your likes appear on your timeline.
You can change this under Settings & Account at any time.

No, thanks

LLVM intro

No description
by

Sion Berkowits

on 23 December 2015

Comments (0)

Please log in to add your comment.

Report abuse

Transcript of LLVM intro

LLVM
An Introduction
Framework
Virtual instruction set
Compiler
Infrastructure
BSD-licensed open source, written in C++
Reusable components for building compilers
Build static compilers, JITs, trace-based optimizers, ...
Runs on Windows, Linux, OSX, and several others
Common language- and target-independent IR
3 consistent and interchangeable representations
Text based
Bit code (Binary compression)
C++ memory representation
End-to-end compilers, using the LLVM infrastructure
Libraries
Optimizations, Code generators, JIT compiler, Garbage collector...
Tools
Assembler, debugger, linker, code generator, modular optimizer...
Front End
Optimizations
Back End
High level
Language
LLVM IR
CLANG: C / C++ / ObjectiveC / OpenCL
Others: Python, Ruby, Java, many more...

...Easy to write!
LLVM IR
LLVM IR
Function inliner
Instruction combining
Loop unrolling
Vectorizer
Rotate Loops
Basic Alias Analysis
Basic CallGraph Construction
Dependence Analysis
Dominance Frontier Construction
Dominator Tree Construction
Simple mod/ref analysis for globals
Count the various types of Instructions
Interval Partition Construction
Induction Variable Users
Lazy Value Information Analysis
LibCall Alias Analysis
Natural Loop Information
Memory Dependence Analysis
Post-Dominance Frontier Construction
Post-Dominator Tree Construction
Detect single entry single exit regions
Scalar Evolution Analysis
ScalarEvolution-based Alias Analysis
Aggressive Dead Code Elimination
Promote ‘by reference’ arguments to scalars
Basic-Block Vectorization
Profile Guided Basic Block Placement
Break critical edges in CFG
Merge Duplicate Global Constants
Simple constant propagation
Dead Code Elimination
Dead Argument Elimination
Dead Store Elimination
Deduce function attributes
Dead Global Elimination
Global Variable Optimizer
Global Value Numbering
Canonicalize Induction Variables
Function Integration/Inlining
Insert instrumentation for edge profiling
Combine redundant instructions
Interprocedural constant propagation
Loop Invariant Code Motion
Delete dead loops
Extract loops into new functions
Loop Strength Reduction
Tail Call Elimination
Strip Unused Function Prototypes
Strip all symbols from a module
Code sinking
Simplify the CFG
Simplify well-known library calls
Sparse Conditional Constant Propagation
Scalar Replacement of Aggregates (DT)
Demote all values to stack slots
Reassociate expressions
Remove unused exception handling info
Partial Inliner
Unify function exit nodes
Merge Functions
MemCpy Optimization
Promote Memory to Register
Lower SwitchInsts to branches
Lower atomic intrinsics to non-atomic form
Unswitch loops
Canonicalize natural loops
Also easy to write!
Machine code
LLVM IR
Supported targets:
X86
ARM
PPC
MIPS
C
Many others

... Not so easy to write :-(
Written by:
Sion Berkowits


With A lot of data taken from
llvm.org
and others

define i32 @foo(i32 %a, i32 %b) {
entry:
%cmp = icmp eq i32 %a, 0
br i1 %cmp, label %if.then, label %if.end
if.then:
br label %return
if.end:
%sub = sub i32 %a, 1
%add = add i32 %b, 1
%call = call i32 @foo(i32 %sub, i32 %add)
br label %return
return:
%retval.0 = phi i32 [ %b, %if.then ], [ %call, %if.end ]
ret i32 %retval.0
}
Unlimited virtual registers -
Any string, start with %
Floating point



Integers



Pointers



Vectors



Conversions
RISC-like instruction set
- Most ops work on registers
- Explicit Load, Store
- Explicit ops on structures
- Most ops take all basic types
... And all vector types
LLVM-IR Language is
SSA
Single Static Assignment
- Each variable is assigned exactly once
- For example:
-
%a = add i32 %b, %c
- The value of
%a
has a 1-1 correlation with
the "
add i32 %b, %c
" operation
But what happens here?

if (cond)
x = a + b;
else
x = a * b;
foo (x);
br i1 %cond, label %if.true, label %if.false
if.true:
%x.1 = add i32 %a, %b
br label %end.if
if.false:
%x.2 = sub i32 %a, %b
br label %end.if
end.if:
%x = phi i32 [%x.1, %if.true], [%x.2, %if.false]
call void @foo(i32 %x)
sion.berkowits [at] gmail.com
Class Value
Type type;
Class User
Class Instruction
- The most basic class in LLVM.
- Many things are values:
constants, arguments, instructions, functions
- All values are typed
- Any value that uses other values
- For example instructions, Functions, etc.
- Holds a list of the used values
- Common base class for all
LLVM instructions
- Due to SSA, Every instruction
IS A value.
Class
AllocaInst
Class
LoadInst
Class
StoreInst
Class
CmpInst
Class
ICmpInst
Class
UnaryInstruction
Class
GetElementPtrInst
Class
FCmpInst
Class
CallInst
Class
SelectInst
Class
PHINode
Class
BinaryOperator
C++ Memory representation
* Partial list
Compilers
Libraries
Tools
- Clang (C/C++)
- MacRuby
- Jade (JIT vodio decoder)
- ...Many others
- Use any component/s from
LLVM in your program
- Transformation libraries
- Target-specific code generators
- IR encoder, decoder
- Interpreter, JIT compiler
- Others
- LLVM Assembler,
Disassembler
- Optimizer
- Code generator
- Bugpoint test
reducer
- LLDB
- ...
Summary
- LLVM is a modern Open-source compiler
infrastructure
- Modular C++ structure
- Easy to learn and use
- Rich tool set allows for easy
development of extensions

- Popular
- Used by many leading companies and
universities
- Gradually replacing GCC

- Not only a Compiler
- Vehicle for testing and developing
optimizations
- Useful for writing tools

- Bring your own ideas!
Start here:
http://llvm.org
Low Level Virtual Machine
Strongly typed
language
Supported types:
-
half
(16 bit)
-
float
(32 bit)
-
double
(64 bit)
-
fp128
(128 bit, 112 bit mantissa)
-
x86_fp80
(80 bit x87)
-
ppc_fp128
(128 bit, ppc)
%x = fmul double %a, %b
8388607
= 2^23 - 1
Supported types:
i1 i8 i16 i32 i64 i128
But also:
i3 i17 i52 i298 i8388607
There is no notion of unsigned type.

Arithmetic operations require all variables of the same type
%res = udiv i35 %val1, %val2
Pointers can reference almost all types
(excluding Void and Label types)

Pointers may have an optional address space attribute
Semantics of address spaces are target dependent
%data.1 = load i32* %addr.1
store float %f_data, float addrspace(7)* %f_addr
A vector can consist of any amount >0 of variables, from any Integer, FP, or Pointer to Int/FP type

< #elements X elemType >
All arithmetic and memory operations work on Vectors. Additional operations exist to build/break vectors
%res_v = fmul <16 x double> %a_v, %b_v
Due to strict typing, only explicit conversions between types are allowable
-

trunc, zext, sext

- Between different integers
-

fptrunc, fpext

- Between different floating points
-

fptoui, fptosi, sitofp, uitofp

- Integer <-> Floating point
-

ptrtoint, inttoptr

- Integer <-> pointer
-

bitcast

- Any type to any type, of the same bit width
Works on Vectors as well
For example:

x = a + b;
foo(x);
x = x + c;
bar(x);
%x.1 = add i32 %a, %b
call void @foo(i32 %x.1)
%x.2 = add i32 %x.1, %c
call void @bar(i32 %x.2)
According to wikipedia: "...simultaneously simplifies and improves the results of a variety of compiler optimizations, by simplifying the properties of variables".

Simplifies:
- def-use graphs. Every variable is initialized only once.
- Lifetime of variables is easily discovered

Benefit to optimizations (subset):
- Dead code elimination
- Register allocation
- Constant propagation
- Global value numbering (CSE)
Why SSA?
Module
Function
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Function
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Function
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Basic Block
Instruction
Instruction
Instruction
Instruction
Full transcript