
Jose Carlos Almeida Santos
José Carlos Santos is a computer scientist at Microsoft Corporation. He holds a PhD in machine learning from Imperial College. He taught at Imperial College and has extensive experience in search engines, inductive logic programming, and other fields of computer science. He has received an award from Microsoft and holds two US patents in computer science.
Academic Background
Ph.D., Computer Science, Imperial College, London, 2010
• Thesis: Efficient Learning and Evaluation of Complex Concepts in Inductive Logic Programming
• Area: Machine Learning
M.Sc. (1st year of the 4-year PhD programme), Bioinformatics, 2007
• Project: Predicting anti-cancer molecule activity using machine learning algorithms
M.Sc., Artificial Intelligence, Universidade Nova de Lisboa, Portugal, 2006
• Thesis: Mining Protein Structure Data
Licenciatura (5-year degree), Informatics Engineering, 2004
• The "Licenciatura" degree consisted of ≈ 50 semester-long courses providing a solid background in the main areas of Computer Engineering and Mathematics.
Work History
Principal Software Engineer - Microsoft Portugal - January 2018 - present
Integrating GPT models into various Microsoft product features, e.g. the tone changer in the SwiftKey Microsoft AI keyboard.
Worked on the price insights team, where we built a service comparing the prices of tens of millions of products across thousands of retailers.
Improving relevance, engagement and Daily Active Users (DAU) on Bing for Commerce.
Senior Software Engineer - Microsoft London Search Technology Center - January 2012 to December 2017
Worked on Bing’s query formulation module to improve query formulation time and overall user success in the search engine. Areas I worked on include:
Suggestion histogram mining, cleaning, processing and ranking.
Contextualizing suggestions by location, session and entry point.
Offline metrics development as a proxy for online success.
Extensive A/B tests to improve relevance as measured by key business metrics.
Development and improvement of Bing’s query rewriting module for the main European markets leading to more relevant search results.
Optimization of the CAL building pipeline. Mining of search logs.
Post-doctoral researcher at MLDC - Microsoft Portugal - January 2011 to December 2011
Post-doctoral researcher at the Microsoft Language Development Center (MLDC), working with the Query Rewriting group of the Munich Search Technology Center. The main work was on improving the relevance of the Combined Alterations module (CAL) of Bing in Portuguese. The CAL module is responsible for expanding a query so that it conveys more meanings.
Teaching assistant of Prolog, Introduction to Artificial Intelligence and Introduction to Bioinformatics - Department of Computing, Imperial College, London - 2007 to 2010
Prolog: 2009/2010
Introduction to Artificial Intelligence: 2007/2008
Introduction to Bioinformatics: 2007/2008
Helped students solve the course exercises. Marked coursework and exams.
Software Design Engineer intern - Microsoft USA, Redmond - July 2005 to September 2005
Software Design Engineer summer intern at Microsoft in Redmond, Washington, USA. Worked in the Windows Server Clusters - High Availability team, helping to develop the Cluster Management GUI for Longhorn Server. Main work was the development of controls, an error reporting infrastructure with Watson integration, and product stabilization. The code developed was shipped with Longhorn Server (Windows Server 2008).
Junior consultant - Novabase Business Intelligence - August 2004 to June 2005
Developed a tool for Caixa Geral de Depósitos (the largest Portuguese bank) to automate the monthly processing of hundreds of Excel spreadsheets. This C# tool programmatically called the Excel API and an external OLAP plugin to execute a set of complex operations on the workbooks. The tool required no human intervention and as of 2010 was still in production.
Development of a Java framework to extend the HTML rendering framework of MicroStrategy.
Development of a Data Quality tool (for cleaning data, matching similar records, etc.) to compete with Quality Stage. Responsible for the whole GUI, which was written from scratch: about 15,000 lines of C# and 1,000 lines of C++.
Intern
Programmer - Portuguese Competition Authority - July 2003 to August 2003
Development of a company merger and acquisition simulator implementing the Cournot and Perry-Porter models. This merger simulator was developed under the guidance of Economics Professor Duarte Brito. As of 2010, the merger simulator was still being used at the Competition Authority.
Teaching assistant of Programming I and Programming II courses - Department of Computer Science, FCT, Universidade Nova de Lisboa - 2002 to 2006
Programming I (C++): 2002/2003, 2003/2004, 2004/2005
Programming II (advanced C++): 2001/2002, 2002/2003, 2003/2004, 2005/2006
Responsible for the practical component of the courses (3-6 hours per week).
Helped students solve the course exercises. Marked coursework and exams.
Awards/Honours
• 2016 - FY16 Individual Contributor Greatness award given by Microsoft.
• 2006 - Wellcome Trust Scholarship for the 4-year PhD programme at Imperial College London.
• 2004 - Most Valuable Student award given by Microsoft.
US Patents
• 2022 - Determining Digital Content Service Quality Levels Based on Customized User Metrics.
• 2016 - Query classification for appropriateness.
Publications
José C. A. Santos, Houssam Nassif, David Page, Stephen H. Muggleton and Michael J. E. Sternberg. Automated identification of protein-ligand interaction features using Inductive Logic Programming: a hexose binding case study. BMC Bioinformatics, 13:162, 2012 (open access article).
Jianzhong Chen, Stephen Muggleton, and José Carlos Almeida Santos. Learning probabilistic logic models from probabilistic examples. Machine Learning, 73(1):55–85, 2008. (UKPMC pdf).
Contests
Leader of the Caparica Lions team 2001 to 2004
• Represented my university, FCT-UNL, in 4 editions of the Portuguese Inter-University Programming Marathon, in 3 editions of the National Contest in Logic Programming and in 4 editions of the ACM Southwestern European Regional Contest.
• The best results were 3rd place at PIUP 2002, 3rd place at NCLP 2004 and 14th place at SWERC 2001.
• Over 400 ACM programming problems solved individually (mainly in C) and verified to be correct through their automated judging system. These problems cover algorithms, data structures and mathematical concepts fundamental in Computer Science, such as: graph search, sorting, primality tests, congruences, matrix operations, backtracking, permutations, geometry, binary trees, tries, hash tables, etc.
Portuguese Informatics Olympiads, Member of the organization - 2001 to 2004
• Together with Professor Pedro Guerreiro and Pedro Ribeiro, I was responsible for the coaching and scientific preparation of the Portuguese team in the International Olympiads in Informatics.
• Member of the jury
Contestant - 1998
• Second place at the Portuguese Informatics Olympiads, representing Portugal in the 1998 International Olympiads in Informatics.