Presentación Jornadas Datamining 2015

by Ricardo Pasquini on 11 October 2016

Transcript of Presentación Jornadas Datamining 2015

Software Engineering / Data Mining

• The School of Engineering (FI) and the IAE are joining forces on a project that seeks to improve the use of information technologies for business research.

Project framework
Who we are
Startups, Users,
Investments, Jobs
Followers, Following, mentions,
use of #, contents and places
Education, experience
1. Tangela
NodeXL
Multivariate Analysis - Econometrics
Dynamic Visualization of Descriptive Statistics
• For each data extraction point (for example, monthly), statistics on:
Number of existing startups by location,
Number of entrepreneurs (founders),
Number of investors (angels and funds).
New startups,
New founders,
New investors.

• Startups:
Die (perhaps delisted)
“Zombies” (not delisted but perhaps not active)
Exits (acquisitions by a company)
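The per-extraction statistics above can be sketched as a comparison between two consecutive snapshots. This is only an illustration: the record layout (a dict from startup id to location) and the names `snapshot_stats`, `january` and `february` are assumptions, not part of the project's code.

```python
from collections import Counter


def snapshot_stats(previous, current):
    """Summarize one extraction snapshot and compare it with the previous one.

    Each snapshot is a dict mapping a startup id to its location
    (a hypothetical, simplified record layout).
    """
    by_location = Counter(current.values())
    new_ids = sorted(set(current) - set(previous))
    # Startups present before but absent now are only *possibly* dead:
    # a delisting is the observable signal, not the cause.
    missing_ids = sorted(set(previous) - set(current))
    return {
        "startups_by_location": dict(by_location),
        "new_startups": new_ids,
        "no_longer_listed": missing_ids,
    }


# Two made-up monthly snapshots.
january = {"s1": "Buenos Aires", "s2": "Santiago"}
february = {"s1": "Buenos Aires", "s3": "Buenos Aires"}
stats = snapshot_stats(january, february)
```

The same comparison would apply to founders and investors; "zombies" cannot be detected this way because they remain listed.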
• One network will be of investments: if person A invests in startup S and person B is a founder of S, then A and B are connected by an investment relationship.

• The network link could be extended to other types of relationships, including advisor and board member, as well as Followers and Following.

• We want the networks to be dynamic, that is, to be able to recognize new connections based on a date field.
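A minimal sketch of the investment network rule (A invests in S, B founded S, so A and B are linked), with a date field used to recover the network as of a given day. The record layout, the role labels and the function name `investment_edges` are hypothetical.

```python
from datetime import date
from itertools import product

# Hypothetical records: (person, startup, role, date the relation was recorded)
relations = [
    ("A", "S", "investor", date(2015, 3, 1)),
    ("B", "S", "founder", date(2014, 1, 15)),
    ("C", "S", "founder", date(2014, 1, 15)),
    ("A", "T", "investor", date(2015, 9, 10)),
    ("D", "T", "founder", date(2015, 6, 1)),
]


def investment_edges(records, as_of):
    """Return (investor, founder, startup) links that exist on the date `as_of`."""
    investors, founders = {}, {}
    for person, startup, role, when in records:
        if when > as_of:
            continue  # relation not yet recorded at this date
        if role == "investor":
            investors.setdefault(startup, set()).add(person)
        elif role == "founder":
            founders.setdefault(startup, set()).add(person)
    edges = set()
    for startup, backers in investors.items():
        for investor, founder in product(backers, founders.get(startup, ())):
            if investor != founder:
                edges.add((investor, founder, startup))
    return edges


edges_mid_2015 = investment_edges(relations, date(2015, 6, 30))
edges_end_2015 = investment_edges(relations, date(2015, 12, 31))
new_edges = edges_end_2015 - edges_mid_2015  # connections created in between
```

Differencing two dated views is one simple way to recognize new connections; other relation types (advisor, board member) would be further role labels handled the same way.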
Databases in network form
Collect all information on financing already obtained.
Startups doing public fundraising have special information on the terms of their offer (valuation). Later we could find out how they fared.

Investments and Public Fundraising
Business Research
Applied Knowledge
1. Specific sources
1) Generate indicators on entrepreneurial ecosystems and their financing at a global level, allowing us to monitor the phenomenon and its evolution.

Sources such as GEM (Global Entrepreneurship Monitor) are currently used.
Provide indicators and visualizations updated in real time.

2) Answer research questions:

How should entrepreneurs use their contact networks to obtain financing?
What matters more for a startup's success? Its founding team? Its contact networks?
How do we foster the entrepreneurial ecosystem in LATAM?

3) Implementation of predictive models: successful startups, recommended connections, etc.
Ricardo Pasquini

Carolina Dams
Virginia Sarria Allende

Gabriela Robiolo

FI students
The Startup Financing and Networks Project
General Objectives

Data Sources
Our Apps: Tangela today
Startups.
Users (Founders, Investors, Mentors, Lawyers...)
Startup Funding (investment rounds)
Startup Network (startups connected by people in common)
Users Network (users connected by startups in common).
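The Startup Network and Users Network described above are two projections of the same user-startup membership data. The sketch below illustrates that idea only; the `memberships` pairs and the `project` helper are made up and do not describe Tangela's actual implementation.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical membership pairs: (user, startup)
memberships = [
    ("ana", "s1"),
    ("ana", "s2"),
    ("bruno", "s2"),
    ("carla", "s3"),
    ("bruno", "s3"),
]


def project(pairs, index):
    """Link items of one kind that share at least one item of the other kind.

    index=0 groups by user and links startups (Startup Network);
    index=1 groups by startup and links users (Users Network).
    Each link maps to the set of shared items that justify it.
    """
    groups = defaultdict(set)
    for pair in pairs:
        groups[pair[index]].add(pair[1 - index])
    links = defaultdict(set)
    for key, members in groups.items():
        for a, b in combinations(sorted(members), 2):
            links[(a, b)].add(key)
    return dict(links)


startup_network = project(memberships, 0)
user_network = project(memberships, 1)
```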
Data collection mainly using APIs

Option to export databases:
A set of startups or users is selected by country (and state), period, profile quality, etc., and then:
Analysis Library
http://home.uchicago.edu/~craigtutterow/index.html
LinkedIn: What are we interested in?
General data (name, job title, company, industry)

Experience (positions)

Education

Contacts
https://developer.linkedin.com/docs/fields
https://developer.linkedin.com
LinkedIn API
Applications - Examples
Next objective:
* Feasibility study

The API might not provide third parties' full experience, only their latest position
Only the registered user's contacts
It might not provide education
Resources
Sirius. Company incubated at FI
"Startup Financing and Networks: An Interdisciplinary Project"
Medium- and long-term objectives

Build capabilities for research on social phenomena using large-scale digital information: computational social and business sciences.

Precedents: CSS Research Center @Stanford, 2015 CSS Summit, 2015 International CSS Conference


Opportunities for generating interdisciplinary knowledge, for example:
New data sources make it possible to test hypotheses that previously could not be tested.
Predictive models grounded in theory.

Ignacio Nuñez
Javier Isoldi
Alejandro Ciatti
Theoretical and empirical research on entrepreneurship and entrepreneurial finance
Collecting data from the Web (Web mining),
Data analysis (in large volumes and in networks).
2. Complementary sources
Collection and organization
Collection applications
3. Shelob
Analysis
Graph-oriented databases

3. Web pages
4. Funds and Deals
News
SeedDB
Data in network format
The Startup Financing and Networks Project - Sources, Apps and Tools
These network questions can be associated with the value-added question:
"What is the value-added of investors?"


S
I
i) Experience, advice and directions
ii) Connections
iii) Signaling
Networks
Entrepreneurs should attract the most experienced, reputed, and best-networked investors.
Should they?
Value-added comes at a cost.

H1:
A higher degree in the founders' networks will reduce the networking value-added by investors when the networks overlap.

New hypothesis examples
Ex. S1 is more likely to receive investment from 3 than S7 is to receive investment from 8
H2:
Greater overlap between networks will increase the chances of matching with a reputable investor


Ex. The networking value-added of receiving investment from 1 or from 3 should be roughly the same
Value-added and matching questions are connected
Endogeneity problem:
Individuals in blue, Startups in red
Each link is characterized by a type of relationship r={F,I,A}, corresponding to founder (F), investor (I), and advisor (A).
Methodology
Representing the entrepreneurial finance ecosystem in a network
Individuals' network
Companies' network*
Networks Representation
Measuring Performance
Financing status at different time points. Arrows illustrate all paths a startup could have gone through.
Model 1
Model 2
Transition probabilities:
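As an illustration of how transition probabilities between financing statuses could be estimated from observed histories, here is a minimal frequency-count sketch. The status labels ("seed", "series_a") and the histories are invented for the example; the source does not specify the states or the estimator used in Model 1 and Model 2.

```python
from collections import Counter, defaultdict


def transition_probabilities(histories):
    """Estimate P(next status | current status) by counting observed moves.

    `histories` holds one list of statuses per startup, ordered in time.
    """
    counts = defaultdict(Counter)
    for statuses in histories:
        for current, following in zip(statuses, statuses[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }


# Invented histories observed at three time points each.
histories = [
    ["seed", "seed", "series_a"],
    ["seed", "series_a", "series_a"],
    ["seed", "seed", "seed"],
]
probs = transition_probabilities(histories)
```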
Disentangling investors' networks effects
Performance
Investor Network
(experience, or reputation)
Explicit modelling of matching to control for endogeneity.
Network indicators at the node level:
Degree: number of connections
Closeness: inverse of the average distance to the other nodes
Betweenness: proportion of shortest paths that pass through the node
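The three node-level indicators (degree, closeness, betweenness) can be computed on a small undirected graph as sketched below. This is a plain-Python illustration, not the library the project uses: the adjacency-list layout and the toy path graph are assumptions, betweenness uses Brandes' algorithm, and it is reported as an unnormalized sum over node pairs rather than as a proportion.

```python
from collections import deque


def degree(graph):
    """Number of connections of each node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def closeness(graph):
    """Inverse of the average shortest-path distance to reachable nodes."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """For each node, the summed fraction of shortest paths passing through it."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []
        preds = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}  # number of shortest paths
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    sigma[nb] += sigma[node]
                    preds[nb].append(node)
        delta = {node: 0.0 for node in graph}
        for node in reversed(order):  # accumulate dependencies back to source
            for p in preds[node]:
                delta[p] += sigma[p] / sigma[node] * (1 + delta[node])
            if node != source:
                score[node] += delta[node]
    # In an undirected graph every pair is visited from both endpoints.
    return {node: value / 2 for node, value in score.items()}


# Toy path graph a - b - c - d as adjacency lists.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
deg = degree(graph)
clo = closeness(graph)
btw = betweenness(graph)
```

In the toy graph the two middle nodes have the highest closeness and betweenness, which is the sense in which a node is "more central".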
“The Added-Value of Network Connections in Entrepreneurial Finance”


Pasquini and Robiolo (2015)
In entrepreneurial finance the question of who finances a project is vital
Sørensen (2007): the higher the number of deals completed by an investor (as a measure of experience), the higher the chances that the invested startup will reach an IPO.
Less research has tried to properly identify the added value that investors’ connections bring to startups.
Hsu (2004) finds that offers from investors who have completed more deals are also more likely to be accepted.
Evidence
Hochberg, Ljungqvist, and Lu (2007).
Networks of syndicated investments.
The more central a venture capital fund is in a network of investment syndicates, the higher the chances that its invested startups will survive into a subsequent investment round.
The understanding of investors' value-added, particularly in terms of network resources, is still limited and deserves further study.
The literature ignores the potential impact of other types of connections that are vital for entrepreneurs, such as those provided by advisors or startups’ founders.
2. A number of investors have a very high number of connections. Will these connections always add proportional value to startups?
Time constraints

Centrality and distance to resources
Hypotheses

1. The best-connected startups (i.e., startups with higher network centrality) achieve higher performance in terms of total funds raised.
2. Omitting the networks of founders, advisors and other roles induces a significant omitted variable bias.
3. Omitting the network in common between investors induces a significant omitted variable bias in value added estimation.
4. The relationship between connections and performance is positive and concave, therefore adding new connections adds less value beyond a certain point.

Gaps

1. Prescriptions about which investors to match with, for example, have yet to take the startup's existing network of connections into account.
A related problem arises when an investor's networking value is equated with the total number of deals the investor has completed: this ignores the connections that prospective investors have in common.
Descriptive Statistics - Network Indicators - California Startups
1. Estimation of the Value-added of Networks on Venture Performance
Total fundraising at time t since seed.
Shelob today

In development

Extracts profiles from lists of entrepreneurs and compiles them into a database, in particular their education and professional experience.
Follows existing links and searches for profiles when no links exist.

Conditional on the amount of time elapsed and on having survived until that time, startups with better-connected (i.e., more central) networks raise more funds.
An increase of one standard deviation in the number of edges (approx. 280 edges) is associated with a 53% increase in total fundraising.
Baseline specification
6 months and 1 year temporal lags
Increasing the closeness indicator by one standard deviation (approximately 0.05 points in the indicator) increases total fundraising by approximately 80%.
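Percentage effects of this kind are typically read off a regression with a logged dependent variable. The sketch below shows that arithmetic only, under the assumption (not stated in the source) that the baseline specification regresses log total fundraising on the network indicator. The function names are hypothetical and the implied coefficient is a back-of-the-envelope value, not a reported estimate.

```python
import math


def percent_effect(coefficient, delta_x):
    """Relative change in y when x rises by delta_x, if ln(y) = a + b*x."""
    return math.exp(coefficient * delta_x) - 1.0


def implied_coefficient(effect, delta_x):
    """Coefficient b that would produce the given relative change in y."""
    return math.log(1.0 + effect) / delta_x


# Assumption: a 53% increase for one standard deviation of about 280 edges.
b_edges = implied_coefficient(0.53, 280)
round_trip = percent_effect(b_edges, 280)
```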
We estimate added-value regressions by disentangling startup connections into those from founders, advisors, and other key roles (such as board members or lawyers) separately.

We estimate the magnitude of the resulting bias in our data.
Biases When Omitting Networks of Founders, Advisors and other Roles
$\text{startup}_n = \text{investor}_n + \text{founders}_n + \text{advisors}_n - \text{common}_n$
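The decomposition above can be checked with sets of connections. In this sketch the "common" term is interpreted as the number of double-counted connections (the sum of the three group sizes minus the size of their union); that interpretation, the function name and the sample ids are assumptions.

```python
def startup_connections(investor, founders, advisors):
    """Return (total distinct connections, connections counted more than once)."""
    union = investor | founders | advisors
    common = len(investor) + len(founders) + len(advisors) - len(union)
    return len(union), common


# Made-up connection ids contributed by each role.
investor = {"p1", "p2", "p3", "p4"}
founders = {"p3", "p5"}
advisors = {"p4", "p6"}
total, common = startup_connections(investor, founders, advisors)
```

Here the startup has 6 distinct connections: 4 + 2 + 2 minus the 2 that appear in more than one group.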
As a result, the estimate of the added value of investors is only slightly biased, by about 2 percentage points.

The intuition is that founders, advisors and other roles contribute relatively few connections compared with investors.

Since the connections provided by investors are the ones that drive the startup's total connections, it is reasonable to conclude that the biases from omitting the remaining connections are not large.
The problem with omitting the networks of founders, advisors and other roles is that we will tend to overestimate the effect of investors' networks.
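The direction of this bias follows from the textbook omitted-variable formula. The derivation below is a generic sketch, not the paper's exact specification: $x_1$ stands for investor connections, $x_2$ for the omitted connections (founders, advisors, other roles), and $y$ for performance.

```latex
% Assumed "long" regression, including the omitted connections x_2:
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u

% "Short" regression that leaves x_2 out:
y = \tilde{\beta}_0 + \tilde{\beta}_1 x_1 + \tilde{u}

% Expected value of the short-regression coefficient:
E[\tilde{\beta}_1] = \beta_1 + \beta_2 \, \frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_1)}
```

If the omitted connections raise performance ($\beta_2 > 0$) and are positively correlated with investor connections, the second term is positive and the investor effect is overestimated.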
Estimate of Bias
where
Investors Connections vs. Startup Connections
We disentangle the effects of those connections from other connections that do not add centrality.

We start by distinguishing an investor's total number of connections from the number of new connections the investor brings to the startup.
The bias implies a difference of approximately 20%, a considerable difference in economic terms.
Estimate of Bias
Are More Connections Always Better?
We test the hypothesis of decreasing returns to new connections by searching for a concave relationship between edges and performance. There should be a point above which new connections add less value.
Overall, we conclude that the relationship between the number of edges and total fundraising is concave, and that adding new connections does not always add performance, though this occurs above the 5th quantile in the distribution.

The results are in line with the hypothesis that investors may face time constraints that prevent them from exploiting their networking assets beyond a certain threshold.
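One simple way to look for a concave relationship is to fit a quadratic and check the sign of the squared term. The sketch below uses that swapped-in technique on synthetic data; the source does not say which functional form the authors used, and the function name `fit_quadratic` and the data are made up.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    # Build the augmented 3x4 system [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        known = sum(aug[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (aug[i][3] - known) / aug[i][i]
    return beta


# Synthetic data generated from a known concave curve (not project data).
edges = [0, 1, 2, 3, 4, 5, 6]
fundraising = [1 + 4 * x - 0.5 * x * x for x in edges]
a, b, c = fit_quadratic(edges, fundraising)
is_concave = c < 0
turning_point = -b / (2 * c)  # beyond this point extra edges stop adding value
```

A negative coefficient on the squared term indicates decreasing returns, and the turning point marks where additional connections stop raising the fitted value.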
Performance Variables
I. The IAE-FI project and computational sciences in business and social science research

II. The Startups and Networks project

III. The "Added-Value of Network Connections" paper

IV. Technical lessons in data collection and computing.
Presentation Plan
Technical Lessons

I. Tangela

II. Network Analysis
Data complexity: relational and graph-oriented databases: OrientDB

Large networks suffer from computational limitations.
E.g., California: a network of 300,000 vertices

Use of libraries that are efficient at large scale.
More Lessons
Ricardo Pasquini - IAE Business School
rpasquini@gmail.com

2. Tanchella