Arash Tavakkol| PhD


Personal information

Currently working at Fortum, Zürich

Senior Solutions Architect

I'm a solutions architect at Fortum, where we work with the latest large-scale data processing technologies. I am also interested in computer systems design, especially computer architecture, systems and circuits (hardware, software, and devices), and modern data storage systems.

Research Interests

Parallel and Distributed Algorithms
Scalable Memory System Design
Multi-core and Many-core Architectures
Near-Data Processing
Big Data Analysis/Acceleration


  • 2016 - 2018

    ETH Zürich, Zürich, Switzerland

    Postdoc - Systems Group, Department of Computer Science.
    Research topics: High-Performance and QoS-Aware Memory/Storage Sub-Systems, RDMA-Based Data Replication in Modern Datacenters, Near-Data Processing
    Advisor: Onur Mutlu.
  • 2010 - 2015

    Sharif University of Technology, Tehran, Iran

    Ph.D. in Computer Engineering - Computer Architecture Major.
    Thesis title: A Scalable and High-Performance Design Architecture for Solid-State Drives.
    Advisor: Hamid Sarbazi-Azad
  • 2005 - 2008

    Sharif University of Technology, Tehran, Iran

    M.Sc. in Computer Engineering - Computer Architecture Major.
    Thesis title: Performance of Crossbar-based Interconnection Networks for Multiprocessors.
    Advisor: Hamid Sarbazi-Azad.
  • 2000 - 2005

    Sharif University of Technology, Tehran, Iran

    B.Sc. in Computer Engineering - Software Engineering Major.
    Undergraduate final project title: An image watermarking framework using discrete Wavelet Transform to protect image databases against unauthorized modifications.
    Advisor: Shohreh Kasaei.
  • 2020 - present

    Software Architect @Fortum, Zürich

  • 2018 - 2020

    Senior Software Engineer @RepRisk AG, Zürich

  • 2016 - 2018

    Senior Researcher @Systems Group, ETH Zürich

  • 2015 - 2016

    Manager @IPM HPC Center

  • 2010 - 2015

    Senior Software Architect @IPM School of Computer Science

  • 2008 - 2010

    Technical Lead @IPM School of Computer Science

  • 2006 - 2008

    Research Assistant @IPM School of Computer Science

  • 2005 - 2008

    Software Engineer @Pima Engineering Co. P.J.S.


  • 2018

    Best Paper Award, European Network on High Performance and Embedded Architecture and Compilation (HiPEAC), 2018.

  • 2016

    Third place award in the 18th Iranian National Khwarizmi Youth Festival for my innovations in storage systems, Iran.

    In the news (in Persian): Official web page, IRNA, ISNA, Mehr News, Hamshahri
  • 2004

    Ranked 4th in the Iranian Nationwide Graduate School Entrance Exam in Computer Engineering, Iran.

  • 2004

    Ranked among the top 15 students of the 8th National Scientific Olympiads in Computer Engineering, Iran.

  • 2000

    Ranked 188th among more than 350,000 applicants in the Iranian Nationwide University Entrance Exams, Iran


Programming Languages: Java, C++/C, C#, Python
Database Management Systems: MySQL, PostgreSQL, MS SQL Server, Redis, Apache Cassandra
Multicore & Parallel Programming Platforms: Nvidia CUDA, OpenMP, MPI, Java Multi-threading, Apache Hadoop, Pthread, Windows Threads
Web and Application Development: Spring Framework, React JSX, JavaScript, EJB, Jooq, JPA, JSP, Thymeleaf, JMS, Hazelcast IMDG, Apache Kafka, PHP, ASPX, .NET Framework, .NET WPF
Applications and Scientific Tools: Kubernetes, Docker, Nginx, Django, Node.js, Elastic Search, Apache Tomcat, Apache HTTP Server, Microsoft IIS
Operating Systems: Linux (CentOS, Fedora, Ubuntu, Debian), FreeBSD, Windows XP/7/8/10
Development Tools: Git, Atlassian (Jira, BitBucket, Bamboo, Confluence), Gitlab, Maven, Nexus, Jenkins, SonarQube, IntelliJ IDEA, MS Visual Studio, Eclipse, Shell Scripting, GCC/G++, Vim, PhpStorm
Scientific Tools: R, Matlab, SimpleScalar, SESC, gem5, MPARM, GPGPU-Sim, Disksim, BookSim, NVSIM, CACTI, Orion, Simulink, LaTeX, MS Office, Open Office
Digital & Embedded System Design: Verilog HDL, Modelsim, State Flow, Intel x86 Assembly


Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses

R. T. Nadig, M. Sadrosadati, H. Mao, N. Mansouri-Ghiasi, A. Tavakkol, J. Park, H. Sarbazi-Azad, J. Gómez-Luna, and O. Mutlu
in 50th International Symposium on Computer Architecture (ISCA '23)
pp. 36:1 - 36:16, 2023.

PLMC: A Predictable Tail Latency Mode Coordinator for Shared NVMe SSD with Multiple Hosts

T. Roy, J. Gupta, K. Kant, A. Pal, D. Minturn, and A. Tavakkol
IEEE International Conference on Networking, Architecture and Storage (NAS)
pp. 1 - 6, 2021.

Quick Generation of SSD Performance Models Using Machine Learning

M. Tarihi, S. Azadvar, A. Tavakkol, H. Asadi, and H. Sarbazi-Azad
IEEE Transactions on Emerging Topics in Computing (TETC)
Vol. 10, No. 4, pp. 1821 - 1836, 2021.

ITAP: Idle-Time-Aware Power Management for GPU Execution Units

M. Sadrosadati, S. B. Ehsani, H. Falahati, R. Ausavarungnirun, A. Tavakkol, M. Abaee, L. Orosa, Y. Wang, H. Sarbazi-Azad, and O. Mutlu
ACM Transactions on Architecture and Code Optimization (TACO)
Vol. 16, No. 1, pp. 3:1 - 3:33, 2018.

Dataplant: In-DRAM Security Mechanisms for Low-Cost Devices

L. Orosa, Y. Wang, I. Puddu, M. Sadrosadati, K. Razavi, J. Gómez-Luna, H. Hassan, N. Mansouri-Ghiasi, A. Tavakkol, M. Patel, J. Kim, V. Seshadri, U. Kang, S. Ghose, R. Azevedo, and O. Mutlu
Preliminary version in arxiv

Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions

A. Tavakkol, A. Kolli, S. Novakovic, K. Razavi, J. Gómez-Luna, H. Hassan, C. Barthels, Y. Wang, M. Sadrosadati, S. Ghose, A. Singla, P. Subrahmanyam, and O. Mutlu
Preliminary version in arxiv

Reducing DRAM Latency via Charge-Level-Aware Look-Ahead Partial Restoration

Y. Wang, A. Tavakkol, L. Orosa, S. Ghose, N. M. Ghiasi, M. Patel, J. S. Kim, H. Hassan, M. Sadrosadati, and O. Mutlu
in 51st International Symposium on Microarchitecture (MICRO '18)
pp. 298 - 311, 2018.

FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives

A. Tavakkol, M. Sadrosadati, S. Ghose, J. Kim, Y. Luo, Y. Wang, N. M. Ghiasi, L. Orosa, J. Gómez-Luna, and O. Mutlu
in 45th International Symposium on Computer Architecture (ISCA '18)
pp. 397 - 410, 2018.

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices

A. Tavakkol, J. Gómez-Luna, M. Sadrosadati, S. Ghose, and O. Mutlu
in 16th USENIX Conference on File and Storage Technologies (FAST '18)
pp. 49 - 66, 2018.

Performance Evaluation of Dynamic Page Allocation Strategies in SSDs

A. Tavakkol, P. Mehrvarzy, M. Arjomand, and H. Sarbazi-Azad
in ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS)
Vol. 1, No. 2, pp. 7:1 - 7:33, 2016.

TBM: Twin Block Management Policy to Enhance the Utilization of Plane-Level Parallelism in SSDs

A. Tavakkol, P. Mehrvarzy, and H. Sarbazi-Azad
in Computer Architecture Letters (CAL)
Vol. 15, No. 2, pp. 121 - 124, 2016.

Design for Scalability in Enterprise SSDs

A. Tavakkol, M. Arjomand, and H. Sarbazi-Azad
in Proceedings of the 23rd International Conference on Parallel Architectures and Compilation Techniques (PACT '14)
pp. 417 - 430, 2014.

Unleashing the Potentials of Dynamism for Page Allocation Strategies in SSDs

A. Tavakkol, M. Arjomand, and H. Sarbazi-Azad
in Proceedings of the 2014 ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '14)
pp. 551 - 552, 2014.

Leveraging HPC Power for Solving Abstract Mathematical Problems

E. Totoni, A. Tavakkol, G.B. Khosrovshahi, A. Khonsari, and H. Sarbazi-Azad
in CSI Journal on Computer Science and Engineering (JCSE)
Vol. 11, No. 2, pp. 1 - 14, 2014.

Network-on-SSD: A Scalable and High-Performance Communication Design Paradigm for SSDs

A. Tavakkol, M. Arjomand, and H. Sarbazi-Azad
in Computer Architecture Letters (CAL)
Vol. 12, No. 1, pp. 5 - 8, January 2013.

Application-Aware Topology Reconfiguration for On-Chip Networks

M. Modarressi, A. Tavakkol, and H. Sarbazi-Azad
in IEEE Transactions on Very Large Scale Integration Systems (TVLSI)
Vol. 19, No. 11, pp. 2010 - 2022, 2011.

Supporting Non-contiguous Processor Allocation in CMPs Using Virtual Point-to-point Links

M. Asadinai, M. Modarressi, A. Tavakkol, and H. Sarbazi-Azad
in Proceedings of Design, Automation & Test in Europe Conference & Exhibition (DATE 2011)
pp. 1 - 6, 2011.

Energy-Optimized On-Chip Networks Using Reconfigurable Shortcut Paths

N. Teimouri, M. Modarressi, A. Tavakkol, and H. Sarbazi-Azad
in Lecture Notes in Computer Science Volume 6566, ARCS 2011
pp. 231 - 242, 2011.

Virtual Point-to-Point Connections for NoCs

M. Modarressi, A. Tavakkol, and H. Sarbazi-Azad
in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)
Vol. 29, No. 6, pp. 855 - 868, June 2010.

An Efficient Dynamically Reconfigurable On-Chip Network Architecture

M. Modarressi, H. Sarbazi-Azad, and A. Tavakkol
in Proceedings of the 2010 47th ACM/IEEE Design Automation Conference (DAC 2010)
pp. 166 - 169, 2010.

Performance and Power Efficient On-Chip Communication Using Adaptive Virtual Point-to-Point Connections

M. Modarressi, H. Sarbazi-Azad, and A. Tavakkol
in Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip (NOCS'09)
pp. 203 - 212, 2009.

Mesh Connected Crossbars: A Novel NoC Topology with Scalable Communication Bandwidth

A. Tavakkol, H. Sarbazi-Azad, and R. Moraveji
in Proceedings of the International Symposium on Parallel and Distributed Processing with Applications (ISPA'08)
pp. 319 - 326, 2008.

Mathematical Analysis of Buffer Sizing for Network-on-Chips Under Multimedia Traffic

A. Khonsari, M. R. Aghajani, A. Tavakkol, and M. S. Talebi
in Proceedings of the IEEE International Conference on Computer Design (ICCD 2008)
pp. 150 - 155, 2008.

Energy Analysis of Re-Injection Based Deadlock Recovery Routing Algorithms

H. Kooti, M. Mirza-Aghatabar, A. Tavakkol, and S. Hessabi
in Proceedings of the International Symposium on System-on-Chip (SOC 2008)
pp. 1 - 4, 2008.

Virtual Point-to-Point Links in Packet-Switched NoCs

M. Modarressi, H. Sarbazi-Azad, and A. Tavakkol
in Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI'08)
pp. 433 - 436, 2008.

The Effect of Network Topology and Channel Labels on the Performance of Label-Based Routing Algorithms

R. Moraveji, H. Sarbazi-Azad, and A. Tavakkol
in Lecture Notes in Computer Science Volume 5101, ICCS 2008
pp. 529 - 538, 2008.

Adaptive Software-Based Deadlock Recovery Technique

M. Mirza-Aghatabar, A. Tavakkol, H. Sarbazi-Azad, and A. Nayebi
in Proceedings of the 22nd International Conference on Advanced Information Networking and Applications - Workshops (AINAW)
pp. 514 - 519, 2008.

Professional Services


  • One Day Workshop on Memory Systems, Sharif University of Technology, Tehran, Iran, Oct. 2014.
  • Two Day Workshop on Multicore Programming, IPM, Tehran, Iran, Feb. 2013.
  • One Day Workshop on GPU Programming, IPM, Tehran, Iran, May 2010.
  • One Day Workshop on IBM Cell/BE Programming, IPM, Tehran, Iran, Apr. 2010.


  • ACM Transactions on Storage (TOS)
  • ACM Computing Surveys
  • IEEE Transactions on Computers
  • Microprocessors and Microsystems Journal (MICPRO)
  • Journal of Computers and Electrical Engineering
  • Journal of Computer and System Sciences
  • Journal of Simulation Modelling Practice and Theory
  • Journal of Nano Communication Networks
  • Cluster Computing Journal
  • Journal of Systems Architecture (JSA)
  • CSI Journal on Computer Science and Engineering (JCSE)
  • Euromicro PDP2013, PDP2014
  • CSI CADS2010, CADS2013


  • IEEE


TA, ETH Zürich, Zürich, Switzerland.

Instructor, Sharif University of Technology, Tehran, Iran.

Instructor, Azad University, Tehran East Branch, Tehran, Iran.

  • Advanced Programming Fall 2007, Spring 2008
  • Computer Graphics Fall 2008
  • Machine Organization and Assembly Language Spring 2008

Academic Projects

Design and implementation of scientific high-performance systems, 2007 - 2008.

  • A 2-teraflops scientific HPC platform was implemented based on IBM Cell/BE processors.
  • A 13-teraflops scientific HPC platform was implemented based on Nvidia GPUs.
  • Well-known parallel algorithms were implemented to examine the peak and sustained performance of these platforms. Examples include matrix multiplication, wavelet transform, fast Fourier transform, N-body simulation, prime number generation, and graphical ray tracing.
  • Software modules and libraries were developed to speed up abstract mathematical problems in graph theory, design theory, and algebraic fields.
  • Some results were published in CSI JCSE:
    Techniques for Utilizing Capabilities of Emerging Chip Multiprocessors in Enumerative Combinatorial Problems
    E. Totoni, A. Tavakkol, Gholamreza B. Khosrovshahi, A. Khonsari, and H. Sarbazi-Azad
    in CSI Journal on Computer Science and Engineering (JCSE)
    Vol. 11, No. 2, pp. 1 - 14, 2014.

Xmulator: An object oriented multi-layered simulation framework, 2006 - present.

  • A detailed SSD simulation platform was implemented. It was referenced in more than 5 publications.
  • A power calculation methodology for NoCs was implemented based on the Orion library. It was referenced in more than 30 publications and 20 PhD and MSc theses.
  • Well-known and special topologies of NoCs and interconnection networks were implemented. They were referenced in more than 20 publications and 10 PhD and MSc theses.

MSc. Thesis: Performance of Crossbar-based Interconnection Networks for Multiprocessors.

  • Pin constraints in traditional multiprocessor and multicomputer systems, and wire-routing limitations in the physical design of Multiprocessor Systems-on-Chip (MPSoCs), have always been a challenge for designers. There is a trade-off between communication bandwidth and these constraints.
    In my MSc. thesis, I introduced a new class of interconnection topologies, called Crossbar-based Networks, to address the pin-out constraint in interconnection networks and the wiring complexity problem in Networks-on-Chip (NoCs). The main idea is to find fully connected sub-graphs in the communication graph of a topology and substitute all of the communication channels in each sub-graph with a single crossbar switch. This substitution preserves communication among the nodes of the sub-graph while reducing the number of physical communication links. As a result, the node degree and the number of wires required to implement the communication structure are greatly reduced, and the designer can overcome physical implementation limitations.
    In addition, I introduced a new topology named Diagonal Connected Mesh (DCM), which modifies the well-known Mesh and Torus topologies to enable crossbar-based communication. I also investigated the topological properties of DCM and proposed deterministic and fully adaptive deadlock-free routing algorithms for it.
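
    The link-count saving behind the crossbar substitution can be illustrated with a minimal sketch. It assumes an undirected communication graph given as a list of node pairs and a clique chosen by hand; the function names, the "XBAR" switch label, and the example graph are illustrative assumptions and are not taken from the thesis.

```python
from itertools import combinations


def replace_clique_with_crossbar(edges, clique, switch="XBAR"):
    """Replace every link inside a fully connected sub-graph with one
    link per node to a single crossbar switch (hypothetical helper)."""
    clique = set(clique)
    links = {frozenset(e) for e in edges}

    # The substitution is only valid if the sub-graph is fully connected.
    for u, v in combinations(clique, 2):
        if frozenset((u, v)) not in links:
            raise ValueError(f"nodes {u} and {v} are not directly connected")

    # Keep links that leave the clique, drop the links inside it.
    kept = {e for e in links if not e <= clique}
    # Each clique node now needs a single port towards the crossbar.
    to_switch = {frozenset((n, switch)) for n in clique}
    return kept | to_switch


def degree(edges, node):
    """Number of links attached to a node."""
    return sum(1 for e in edges if node in e)


# Four fully connected nodes (0..3) plus one external node (4).
original = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
reduced = replace_clique_with_crossbar(original, {0, 1, 2, 3})

print(len(original), "links before,", len(reduced), "links after")
print("degree of node 3:", degree(original, 3), "->", degree(reduced, 3))
```

    In this example the six links inside the four-node clique become four links to the switch, so a clique of k nodes goes from k(k-1)/2 internal links to k, and each clique node keeps a single port for intra-clique traffic.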

BSc. Final Project: An image watermarking framework using discrete Wavelet Transform to protect image databases against unauthorized modifications, 2004 - 2005.

  • In this project a framework for image watermarking was designed and implemented based on the properties of wavelet transform. This framework is used for protection of image databases against unauthorized modifications.

Software Projects

A web-based document indexing/provisioning system, 2009 - 2010.

  • A web-based document management/sharing system was designed and implemented using J2EE for the Institute for Research in Fundamental Sciences (IPM).
  • IPM researchers use this system to share scientific documents in PDF, Word, and text formats.
  • Documents are indexed by content, title, keywords, authors, and category, and users can perform mixed searches over document content and properties.
  • The system automatically checks for duplicate uploads and sends alerts to both the uploader and the system administrators.
  • The system provides a dynamic access control policy for different types of users.
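
One common way to detect duplicate uploads is to compare a cryptographic digest of each file's bytes. The sketch below assumes that approach purely for illustration; the mechanism used in the IPM system is not described here, and the class and method names are hypothetical.

```python
import hashlib


class DuplicateDetector:
    """Remembers a SHA-256 digest per uploaded document (hypothetical)."""

    def __init__(self):
        # Maps content digest -> name of the first document with that content.
        self._first_seen = {}

    def register(self, name, content):
        """Record an upload; return the name of the earlier identical
        document, or None if this content has not been seen before."""
        digest = hashlib.sha256(content).hexdigest()
        earlier = self._first_seen.get(digest)
        if earlier is None:
            self._first_seen[digest] = name
        return earlier


detector = DuplicateDetector()
detector.register("report.pdf", b"example document bytes")
duplicate_of = detector.register("report-copy.pdf", b"example document bytes")
if duplicate_of is not None:
    # A real system would alert the uploader and administrators here.
    print("duplicate of", duplicate_of)
```

A byte-level digest only catches exact copies; documents that differ in metadata or formatting would need a content-based comparison instead.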

Design, setup and installation of a unified Linux-based web-hosting system for IPM information center, 2008 - 2009.

  • All web-related services of IPM, including its official home page, official home pages of its schools, a J2EE-based portal of research projects and employee information, and conference registration pages, were migrated from a Windows 2003 Server to RHEL 5.0.
  • Apache Webserver, Tomcat container, MySQL, and MS SQL Server 2005 were installed and configured to run J2EE, .NET and PHP web applications.
  • A backup server was set up and a backup policy was designed and implemented.
  • A development server was set up and version control and deployment policies were designed and implemented.
  • A unified user home directory management policy was implemented and protocols were defined for user management.
  • An update policy for OS and other server packages was defined and implemented.
  • The project was managed under my supervision.

An automated event management system for IPM School of Particles, 2009.

  • A web-based JSP system was designed and implemented for automated event (conference/lecture/symposium) management in IPM School of Particles.
  • The system operator only provides the event description, the required registration fields, and the required web pages for the event.
  • The system automatically generates registration forms and the required web pages, such as event description, venue information, event program, and event organizers.
  • The system provides a registration control panel for management tasks such as opening/closing registration, confirming registrations, sending email to registrants, and two-step registration.
  • The registration control panel also provides reporting services based on registration input data.

JDB: A distributed data gathering and analysis system for medical research projects, 2008 - 2009.

  • This is a web-based client/server system used for distributed gathering of individual medical information at Shahid Beheshti University of Medical Sciences (SBMU), School of Dentistry. SBMU ranked among the top three medical schools in Iran.
  • The software was developed using C# and .NET Framework.
  • Users can perform complex statistical analyses of the data, including descriptive statistics, bivariate statistics, linear regression, and cluster analysis (K-means and hierarchical).
  • Other features include data import/export in xls/csv/xml formats, database merging, online RESTful data provisioning in both JSON and XML formats, and data visualization (2D & 3D charts, line charts, bar charts, pie charts, etc.).

Code Snippets| Academic and scientific


MQSim is a fast and accurate simulator that models the performance of modern multi-queue (MQ) SSDs as well as traditional SATA-based SSDs. MQSim faithfully models new high-bandwidth protocol implementations, steady-state SSD conditions, and the full end-to-end latency of requests in modern SSDs.

Download source in C++ | Version 1.0


DTA is a Disk Trace manipulation/Analyzer tool. It converts traces generated by Linux blktrace or Event Tracing for Windows to ASCII format. It can also provide statistical analyses of ASCII disk traces.

Download source in C | version 1.5 Download source in C# | version 1.1


Xmulator is an object-oriented, event-based simulator for interconnection networks and wireless networks. I contributed to the packages required for Network-on-Chip (NoC) simulation. Xmulator uses the Orion power library for power and energy estimation.

Download | version 7.0