Human-centric Trustworthy Computer Vision

From Research to Applications

In Conjunction with ICCV 2021

October 17 2021, Virtually

News

  • July 23, 2021:  We have extended the submission deadline by one week.

  • July 23, 2021:  The top-ranking papers will be recommended to an ACM TOMM Special Issue.

  • June 23, 2021:  The CMT submission website is open:

  • June 15, 2021:  The workshop date is announced.

  • May 10, 2021:   The website is online. Call for papers released.


How do we define, pursue, and evaluate trustworthy technologies for human-centric computer vision tasks?

   With the rapid technical progress of computer vision and the spread of vision-based applications over the past several years, human-centric computer vision technologies, such as person re-identification, face recognition, and action recognition, are quickly becoming essential enablers for many fields. Although they bring great value to individuals and society, they also encounter a variety of novel ethical, legal, social, and security challenges. In particular, in recent years, multiple multimedia sensing technologies, together with large-scale computing and storage infrastructures, have been generating a wide variety of human-centric big data at a rapid pace, which provides rich knowledge to support the development of these applications. Meanwhile, such data contains a large amount of private personal information, raising concerns about the safety and trustworthiness of computer vision technologies. Consequently, trustworthy computer vision has been attracting increasing attention from academia and industry. It focuses on human-oriented, fair, robust, interpretable, and responsible vision technologies, and is also at the core of next-generation artificial intelligence (AI). The goals of this workshop are to: 1) bring together the state-of-the-art research on human-centric vision analysis for trustworthy AI; 2) call for a coordinated effort to understand the opportunities and challenges emerging in human-centric trustworthy vision technologies; 3) explore fairness, robustness, interpretability, and accountability oriented toward humans; 4) showcase innovative methodologies and ideas; 5) introduce interesting real-world human-oriented trustworthy systems and applications; 6) give insight into industry's practice of trustworthy AI for human-centric vision and discuss future directions. We solicit original contributions in all fields of trustworthy human analysis to help us better understand the nature of vision algorithms in real-world applications.
We hope the workshop offers a timely collection of research updates to benefit researchers and practitioners working in the broad computer vision, pattern recognition, and trustworthy AI communities.

Call for papers

   We invite submissions for the ICCV 2021 Workshop on Human-centric Trustworthy Computer Vision: From Research to Applications (HTCV2021), which brings researchers together to discuss human-oriented, fair, robust, interpretable, and responsible technologies for human-centric vision analysis. We solicit original research and survey papers of 5 to 8 pages (excluding references and appendices). Each submitted paper will be double-blind peer reviewed by at least three reviewers. All accepted papers will be presented as either oral or poster presentations, with a best paper award, and will appear in the CVF open access archive. Paper submission is through the HTCV2021 CMT site and must follow the policies and submission guidelines described in the ICCV'21 Author Guidelines. Papers that violate anonymity, do not use the ICCV submission template, or exceed 8 pages (excluding references and appendices) will be rejected without review. In submitting a manuscript to this workshop, the authors acknowledge that no paper substantially similar in content has been submitted to another workshop or conference during the review period.
  The scope of this workshop includes, but is not limited to, the following topics:

  • Adversarial attack and defense in face recognition and person re-identification
  • Explainable face and body analysis, generation, and editing
  • Robust human body and face representation learning
  • Face anti-spoofing and deep-fake detection
  • Robust gait and action recognition
  • Secured federated learning
  • Robustness against evolving attacks in computer vision
  • Fairness analysis for data and models of face or human recognition
  • Algorithms, frameworks, and tools for human-centric trustworthy computer vision

  The top-ranking papers will be recommended to an ACM TOMM Special Issue.

Important Dates

Description Date (Pacific Time)
Submission Deadline August 8, 2021 (11:59PM)
Decisions to Authors August 14, 2021 (11:59PM)
Camera-ready Due August 18, 2021 (11:59PM)
Workshop Date October 17, 2021 (Afternoon)


  Time (EDT) Session Speaker
1:00 PM-1:10 PM Opening Host
1:10 PM-1:55 PM Invited talk 1 "Physical and digital fake face detection"  Zhen Lei
1:55 PM-2:40 PM Invited talk 2 "Probabilistic Modeling for Human Mesh Recovery"  Georgios Pavlakos
2:40 PM-3:25 PM Invited talk 3 "Pitfalls on the Road to Trust in Computer Vision"  Karthik Nandakumar
3:25 PM-4:10 PM Invited talk 4 "Challenges of Creating Affective Computational Tools for Behavioral and Clinical Sciences"  Albert Ali Salah
4:10 PM-4:25 PM Oral presentation 1 "Multi-Perspective Features Learning for Face Anti-Spoofing"
4:25 PM-4:40 PM Oral presentation 2 "Rethinking Common Assumptions to Mitigate Racial Bias in Face Recognition Datasets"
4:40 PM-4:55 PM Oral presentation 3 "Transformer Meets Part Model: Adaptive Part Division for Person Re-Identification"
4:55 PM-5:10 PM Oral presentation 4 "SVEA: A Small-scale Benchmark for Validating the Usability of Post-hoc Explainable AI Solutions in Image and Signal Recognition"
5:10 PM-5:20 PM Best Paper Announcement Host
5:20 PM-5:55 PM Poster presentation Pre-recorded videos
5:55 PM-6:00 PM Ending Host

Invited speakers

 Zhen Lei
Title: Physical and digital fake face detection
Abstract: Physical and digital face attacks are the two common types of face attack in practice. In many applications, we have to detect both attacks to assure the security of the system. Physical fake face detection, also known as presentation attack detection or face anti-spoofing, judges whether the face in front of a camera is genuine or fake. Digital fake face detection is used to detect face forgery on the internet. In the first part, I will report recent progress on how to learn the optimal supervision signal so that the model can be better trained. In the second part, I will introduce a 3D-decomposition-based digital fake face detection method and show its effectiveness on the face forgery detection task.

 Georgios Pavlakos
UC Berkeley, USA
Title: Probabilistic Modeling for Human Mesh Recovery
Abstract: Reconstructing humans in 3D from a single image is an inherently ambiguous problem since multiple 3D poses can lead to the same reprojection. However, most related works only return one pose estimate for each input image. This fails to capture the multimodal aspect of the problem and results in systems with potentially non-interpretable or non-trustworthy behavior. In this talk, I will discuss our recent work that tries to embrace this multimodality and recasts the problem as learning a mapping from the input to a distribution of plausible 3D poses. Our approach is based on the normalizing flows model and offers a series of advantages. For conventional applications, where a single 3D estimate is required, our formulation allows for efficient mode computation. Using the mode leads to performance that is comparable with the state of the art among deterministic unimodal regression models. Simultaneously, we demonstrate that our model is useful in a series of downstream tasks, where we leverage the probabilistic nature of the prediction as a tool for more accurate estimation. These tasks include reconstruction from multiple uncalibrated views, as well as human model fitting, where our model acts as a powerful image-based prior for mesh recovery.

 Karthik Nandakumar
MBZUAI, Abu Dhabi
Title: Pitfalls on the Road to Trust in Computer Vision
Abstract: Computer vision systems have had a profound positive impact on human lives in applications ranging from smartphone access control to self-driving vehicles and automated medical image diagnosis. While an exponential increase in data availability, combined with algorithmic advancements in machine learning such as deep neural network models and rapid improvements in computational capabilities, has powered the growth in computer vision over the past two decades, the field is now at a crossroads due to a lack of trust among human users. Numerous concerns over the trustworthiness of computer vision systems have been raised by various stakeholders, including: (i) safety and robustness of computer vision algorithms against adversarial attacks, (ii) potential for causing discriminatory harm against specific population groups, (iii) breach of data privacy and confidentiality, (iv) inability to provide explainable decisions, and (v) overall lack of accountability and auditability. In this talk, we will first review the above pitfalls on the road to trust and the interactions among them. Next, we discuss solutions that have been proposed in the literature to address these concerns. Finally, we identify possible research directions that can facilitate the safe navigation of these pitfalls and take us to the eventual destination of developing computer vision systems that can be trusted by human users.

 Albert Ali Salah
Utrecht University, Netherlands
Title: Challenges of Creating Affective Computational Tools for Behavioral and Clinical Sciences
Abstract: Automatic analysis of human affective and social signals brought computer science closer to social sciences and, in particular, enabled collaborations between computer scientists and behavioral scientists. In this talk, I highlight the main research areas in this burgeoning interdisciplinary area, and provide an overview of the opportunities and challenges. Drawing on examples from our recent research, such as automatic analysis of interactive play therapy sessions with children, and diagnosis of bipolar disorder from multimodal cues, as well as relying on recent examples from the growing literature, I will explore the potential of human-AI collaboration, where AI systems do not replace, but support monitoring and human decision making in behavioral and clinical sciences. I conclude with some controversial issues that may point out what "trustworthy AI" would mean for the technologies developed in this domain.


Organizers

Jingen Liu
JD AI Research, USA
Sifei Liu
Nvidia Research, USA
Wu Liu
JD AI Research, China
Nicu Sebe
UniTN, Italy
Hailin Shi
JD AI Research, China

Committee Chairs

Qian Bao
JD AI Research, China
Yibo Hu
JD AI Research, China
Junbo Guo
People's Daily Online, China


PC members


Advisory board

If you have any questions, feel free to contact < >

The workshop is organized in collaboration with JD AI Research, NVIDIA Research, University of Trento 
and State Key Laboratory of Communication Content Cognition, People's Daily Online.