## February 8, 2022

### Concepts and Profunctor Nuclei

#### Posted by Simon Willerton

Guest post by Matthew Di Meglio and Owen Lewis

Previously on this blog, Simon Willerton introduced the notion of the nucleus of an enriched profunctor, and alluded to the fact that the formal concept lattice of a relation is a particular instance of this general categorical construction. Continuing from Simon’s posts, we will present, through the lens of enriched category theory, a generalization of Formal Concept Analysis called Quantitative Concept Analysis (QCA). The material in this blog post is based on Dusko Pavlovic’s Quantitative Concept Analysis and Jonathan Elliot’s On the Fuzzy Concept Complex.

### Background and running example

QCA aims to extract latent concepts from a dataset which relates a collection of objects to a collection of attributes. For example, suppose that we work for a movie streaming service, and we have a dataset consisting of each user’s rating of each movie (for simplicity, we will assume that all users have rated all movies). Here, the users are the objects and the movies are the attributes. In this context, the word “genre” is a good synonym for the word “concept”, and genres will later be shown be encoded in the nucleus of a particular enriched profunctor. Intuitively speaking, associated to each genre is the collection of movies in the style of that genre and the collection of users who like that genre. Thus the notion of genre should simultaneously capture similarity in users’ tastes and similarity in movies’ styles. QCA is a processes for discovering genres in our dataset; these genres may then be used to decide which movies should be surfaced to which users.

To make our example concrete, our sets of users and movies will be respectively the sets $X = \{\mathtt{Cara}, \mathtt{Dan}, \mathtt{Evie}\} \qquad \text{and} \qquad Y = \{\mathtt{Se7en}, \mathtt{Tarzan}\}.$ Our dataset is expressed as the $X \times Y$-matrix $M = \left[ \begin{array}{r|cc} & \mathtt{Se7en} & \mathtt{Tarzan} \\ \mathtt{Cara} & 5/5 & 4/5\\ \mathtt{Dan} & 3/5 & 1/5\\ \mathtt{Evie} & 2/5 & 4/5 \end{array}\right]$ The $(x, y)$-entry of $M$ is a rating out of $5$ indicating the degree to which user $x$ liked movie $y$, from $0/5$ meaning very little to $5/5$ meaning a lot. We will think of these ratings as real numbers in the interval $V = [0,1]$.

### Enriching category and fuzzy logic

There is a way to view our set $V$ of possible ratings as the object set of a symmetric monoidal closed category $\mathcal{V}$ such that certain constructions from $\mathcal{V}$-enriched category theory, such as $\mathcal{V}$-functors and $\mathcal{V}$-profunctors, turn out to be relevant to QCA. Given objects $x$ and $y$ of $\mathcal{V}$,

• there is exactly one morphism from $x$ to $y$ when $x \leqslant y$ and no morphisms from $x$ to $y$ otherwise ($\mathcal{V}$ is a thin category);
• the monoidal product of $x$ and $y$ is their product $x \cdot y$; and
• the internal hom from $x$ to $y$ is their truncated division $x \backslash y = \begin{cases} y/x &\text{if } y \lt x,\\ 1 &\text{if } y \geq x. \end{cases}.$ Note also that the category $\mathcal{V}$ is complete and cocomplete, with limits and colimits given respectively by infima and suprema.

Viewing $V$ as the truth values of a fuzzy logic, from $0$, meaning completely false, to $1$, meaning completely true, the above categorical notions have natural interpretations in fuzzy logic, summarised in the following table:

Category Theory Logic $\mathcal{V}$
Initial object False $0$
Terminal object/Monoidal unit True $1$
Monoidal product operator Binary conjunction operator $\cdot$
Exists-a-morphism relation Entailment relation $\leqslant$
Internal hom operator Implication operator $\backslash$
Limits Universal quantification $\inf$
Colimits Existential quantification $\sup$

Above, we said that the truncated division operator $\backslash$ may be interpreted as the logical implication operator. That is, given fuzzy truth values $x$ and $y$, we claim that $x \backslash y$ should be interpreted as the fuzzy truth value of the proposition that $x$ implies $y$. Referring back to the definition of truncated division, we see that

• if $y$ is at least as true as $x$, then the proposition $x$ implies $y$ is completely true, i.e. $x \backslash y = 1$; and
• if $y$ is less true than $x$, then the truth value $y / x$ of the proposition $x$ implies $y$ depends on how much $y$ is less true than $x$.

For the above interpretation of limits and colimits of diagrams as universal and existential quantification of predicates, we interpret diagrams in $\mathcal{V}$ as modelling predicates valued in $V$.

### Enriched presheaves and fuzzy subsets

As stated earlier, a genre (concept) will consist of a sub-collection of the set $X$ of all users and a sub-collection of the set $Y$ of all movies. By a sub-collection of a set $Z$, we mean a function on $Z$ valued in our set $[0, 1]$ of truth values; we will refer to such functions on $Z$ as predicates on $Z$. In binary-valued, non-fuzzy logic, predicates are in bijection with the subsets of $Z$; given a predicate $P \colon Z \to \{\top, \bot\}$, the corresponding subset of $Z$ is the preimage of $\top$ in $P$. In the QCA case, we may think of such predicates as fuzzy subsets of $Z$; given a predicate $P \colon Z \to [0, 1]$ and an element $z \in Z$, we may think of $P(z)$ as the truth value of the proposition that $z$ belongs to the fuzzy subset represented by $P$.

In binary-valued logic, the set of all predicates $P \colon Z \to \{\top, \bot\}$ has the structure of a complete and cocomplete preordered set (aka proset) (it is actually a complete and cocomplete partially ordered set, which is also known as a complete lattice or suplattice). Under the correspondence with subsets of $Z$, this is nothing other than the power set of $Z$ with subset inclusion as the preorder. Limits are given by intersection of subsets, and colimits are given by union of subsets.

In the fuzzy QCA setting, the analogous structure on the set of all predicates $P \colon Z \to [0, 1]$ is that of a proxet. A proximity set or proxet is a set $Z$ together with a function $(-,-)_Z \colon Z \times Z \to [0, 1]$, called the proximity map, which satisfies:

• reflexivity: $(z, z)_Z = 1$ for all $z \in Z$, and
• transitivity: $(z_1, z_2)_Z \cdot (z_2, z_3)_Z \leq (z_1, z_3)_Z$ for all $z_1, z_2, z_3 \in Z$.

Under the logical interpretation of multiplication and truncated division given earlier, these axioms look like the axioms of a preorder. Indeed, we may think of proxets as fuzzy preorders, where $(z_1, z_2)_Z$ is the fuzzy truth value of the proposition that $z_1 \preccurlyeq z_2$. Additionally, proxets are closely related to quasimetric spaces, which are metric spaces with the symmetry axiom omitted. Given a proximity function $(-,-)_Z$, the function $-\operatorname{log}\big( (-,-)_Z \big)$ defines a quasimetric. Conversely, given a quasimetric $d(-, -)$, the function $e^{-d(-, -)}$ is a proximity function. Quasimetric spaces viewed as enriched categories are called Lawvere metric spaces.

The proxet structure on the set $[0, 1]^Z$ of fuzzy subsets of $Z$ is given by $(P, Q)_{[0, 1]^Z} = \inf_{z \in Z} \big\{P(z) \backslash Q(z)\big\}.$ Again, under the logical interpretation of infima and truncated division, we can interpret $(P, Q)_{[0, 1]^Z}$ as the fuzzy truth value of the proposition that $P \subseteq Q$ when regarded as fuzzy subsets of $Z$. Pointwise infima and suprema correspond to intersection and union of fuzzy subsets.

By now, those of you familiar with enriched category theory have surely noticed that proxets are precisely (small) $\mathcal{V}$-categories. Indeed, the set of objects of a $\mathcal{V}$-category $\mathbf{Z}$ is the set underlying the preorder or proxet, the hom-object $\mathbf{Z}(z_1, z_2)$ is the truth value of $z_1 \preccurlyeq z_2$ in the binary case and the value of $(z_1, z_2)_Z$ in the QCA case, and the unit and associativity axioms of a $\mathcal{V}$-category give us reflexivity and transitivity in both cases.

Recall that if $\mathcal{V}$ is complete and cocomplete, then for any $\mathcal{V}$-category $\mathbf{Z}$, the category $[\mathbf{Z}^{op}, \mathcal{V}]$ of $\mathcal{V}$-presheaves on $\mathbf{Z}$ is cocomplete with respect to weighted limits and weighted colimits. Here, weighted limits and weighted colimits are computed pointwise. Similarly, the enriched category $[\mathbf{Z}, \mathcal{V}]$ of $\mathcal{V}$-copresheaves on $\mathbf{Z}$ is also complete with respect to weighted limits and weighted colimits.

Regarding a set $Z$ as a discrete $\mathcal{V}$-category $\mathbf{Z}$, predicates $Z \to [0, 1]$ may be seen both as $\mathcal{V}$-presheaves $\mathbf{Z}^{\mathrm{op}} \to \mathcal{V}$ and $\mathcal{V}$-copresheaves $\mathbf{Z} \to \mathcal{V}$. The proxet structure on $[0, 1]^Z$ corresponds exactly to the usual $\mathcal{V}$-category structure on the $\mathcal{V}$-category of $\mathcal{V}$-presheaves and $\mathcal{V}$-copresheaves on $\mathbf{Z}$. Similarly, trivially weighted limits and colimits in these presheaf and copresheaf $\mathcal{V}$-categories correspond respectively to to fuzzy intersections (pointwise infima) and fuzzy unions (pointwise suprema).

### Lifting profunctors

Recall that a genre is a fuzzy set of movies paired with the fuzzy set of people who like those movies. We need a way to relate fuzzy sets of people and fuzzy sets of movies. Our ratings matrix $M$ relates individual people and individual movies; our goal now is to lift this relation to fuzzy sets.

The ratings matrix $M$ is a choice of rating in $V = [0, 1]$ for each (user, movie)-pair. In light of the previous section, this is the same thing as a predicate $X \times Y \to V$, and thus $M$ may be regarded as a $\mathcal{V}$-functor $\mathbf{X}^{\mathrm{op}} \otimes \mathbf{Y} \to \mathcal{V}$, where $\mathbf{X}$ and $\mathbf{Y}$ are the discrete proxets on $X$ and $Y$. The `op’ is included so that $M$ is actually a $\mathcal{V}$-profunctor from $\mathbf{X}$ to $\mathbf{Y}$.

The following construction for lifting $M$ works for a general profunctor. The first step is to curry $M$ in the first argument, yielding the $\mathcal{V}$-functor: $M^\bullet \colon \mathbf{X}^{\mathrm{op}} \to [\mathbf{Y}, \mathcal{V}]$ As described in the previous section, the codomain of this functor is the category of fuzzy subsets of $\mathbf{Y}$.

As the $\mathcal{V}$-category $[\mathbf{X}^{\mathrm{op}}, \mathcal{V}]$ is the free cocompletion of the $\mathcal{V}$-category $\mathbf{X}$, and the unit of the corresponding adjunction is the $\mathcal{V}$-Yoneda embedding $H \colon \mathbf{X} \to [\mathbf{X}^{\mathrm{op}}, \mathcal{V}]$, there is a unique cocontinuous $\mathcal{V}$-functor $M^\star \colon [\mathbf{X}^{{\mathrm{op}}}, \mathcal{V}] \to [\mathbf{Y}, \mathcal{V}]^{{\mathrm{op}}}$ such that the diagram

commutes. $M^\star$ is the desired lifting of $M$.

The density theorem says that each predicate $A \in [\mathbf{X}^{\mathrm{op}}, \mathcal{V}]$ may be written as the weighted colimit $A = \operatorname{colim}^{A}H = \int^{x\in\mathbf{X}}A(x) \otimes H(x),$ and so, by cocontinuity of $M^\star$, we have

(1)$M^\star(A) = M^\star(\operatorname{colim}^{A}H) = \operatorname{colim}^A(M^\star \circ H) = \operatorname{lim}^A M^\bullet = \int_{x\in\mathbf{X}}\mathcal{V}\big(A(x), M^\bullet(x)\big).$

Returning to QCA, recall that in $\mathcal{V}$, limits are pointwise suprema, the tensor product is ordinary multiplication and the internal hom is truncated division. Therefore, we may rewrite (1) as $M^\star(A) = \inf_{x \in \mathbf{X}} \big\{A(x) \backslash M(x, -) \big\}$

If we curry in the second argument rather than the first, we can derive a right adjoint, $M_\star \colon [\mathbf{Y}, \mathcal{V}]^{\mathrm{op}} \to [\mathbf{X}^{\mathrm{op}}, \mathcal{V}]$, of $M^\star$ defined by $M_\star(B) = \inf_{y \in \mathbf{Y}}\big\{B(y) \backslash M(-, y)\big\}.$

### Concepts as fixed points

Our constructions of $M^\star$ and $M_\star$ have a close analogy in linear algebra, discussed in this earlier post by Simon. The upshot is that our functor $M^\star$ is a bit like multiplication by a matrix $M$, and $M_\star$ like multiplication by $M^\intercal$. A common technique for studying the latent information in a matrix is to compute its left and right singular vectors, which are the eignvectors of $MM^\intercal$ and $M^\intercal M$. The extraction of our concepts will follow the same pattern, replacing eigenvectors with fixed points.

The set of concepts associated with the $[0, 1]$-valued matrix $M$ is given by $\operatorname{Fix}(M^\star M_\star) \cong \operatorname{Fix}(M_\star M^\star)$ These fixed points are called the nucleus of the profunctor $M$. Profunctor nuclei uncover interesting structure in many domains beyond QCA. As described in other nCafe posts, they give a categorical interpretation of the Legendre-Fenchel transform, and can be specialized to recover the Isbell completion of a category, which is related to the Dedekind MacNeille completion of a partially ordered set, and the tight span of a metric space.

We can now work out some examples of fuzzy concepts from our ratings matrix $M$. Starting with the representable fuzzy subset $H(\mathtt{Cara}) = \left[\begin{array}{r|cc} \mathtt{Cara} & 1\\ \mathtt{Dan} & 0\\ \mathtt{Evie} & 0 \end{array}\right]$ of the set $X$ of users, we have \begin{aligned} M^\star H(\mathtt{Cara}) = \left[\begin{array}{r|cc} \mathtt{Se7en} & \inf \big\{1 \backslash \tfrac{5}{5}, 0 \backslash \tfrac{3}{5}, 0 \backslash \tfrac{2}{5}\big\}\\ \mathtt{Tarzan} & \inf \big\{1 \backslash \tfrac{4}{5}, 0 \backslash \tfrac{1}{5}, 0 \backslash \tfrac{4}{5}\big\} \end{array}\right] = \left[\begin{array}{r|cc} \mathtt{Se7en} & \inf \big\{\tfrac{5}{5}, 1, 1\big\}\\ \mathtt{Tarzan} & \inf \big\{\tfrac{4}{5}, 1, 1\big\} \end{array}\right] = \left[\begin{array}{r|cc} \mathtt{Se7en} & 1\\ \mathtt{Tarzan} & \tfrac{4}{5} \end{array}\right] \end{aligned} and \begin{aligned} M_\star M^\star H(\mathtt{Cara}) &= \left[\begin{array}{r|cc} \mathtt{Cara} & \inf \big\{1 \backslash \tfrac{5}{5}, \tfrac{4}{5} \backslash \tfrac{4}{5}\big\}\\ \mathtt{Dan} & \inf \big\{1 \backslash \tfrac{3}{5}, \tfrac{4}{5} \backslash \tfrac{1}{5} \big\}\\ \mathtt{Evie} & \inf \big\{1 \backslash \tfrac{2}{5}, \tfrac{4}{5} \backslash \tfrac{4}{5}\big\} \end{array}\right] = \left[\begin{array}{r|cc} \mathtt{Cara} & \inf \big\{1, 1\big\}\\ \mathtt{Dan} & \inf \big\{\tfrac{3}{5}, \tfrac{1}{4}\big\}\\ \mathtt{Evie} & \inf \big\{\tfrac{2}{5}, 1\big\} \end{array}\right] = \left[\begin{array}{r|cc} \mathtt{Cara} & 1\\ \mathtt{Dan} & \tfrac{1}{4}\\ \mathtt{Evie} & \tfrac{2}{5} \end{array}\right]\\ \end{aligned} As $M^\star M_\star M^\star = M^\star$ in general, the pair $\big(M_\star M^\star H(\mathtt{Cara}), M^\star H(\mathtt{Cara})\big) = \left( \left[\begin{array}{r|cc} \mathtt{Cara} & 1\\ \mathtt{Dan} & \tfrac{1}{4}\\ \mathtt{Evie} & \tfrac{2}{5} \end{array}\right], \left[\begin{array}{r|cc} \mathtt{Se7en} & 1\\ \mathtt{Tarzan} & \tfrac{4}{5} \end{array}\right] \right)$ is a fuzzy concept of $M$, called the representable fuzzy concept for $\mathtt{Cara}$. The first vector in the pair may be interpreted as the fuzzy set of people with similar taste to $\mathtt{Cara}$ and the second as the fuzzy set of movies that $\mathtt{Cara}$ likes.

Similarly, letting $H' \colon \mathbf{Y} \to [\mathbf{Y}, \mathcal{V}]^{{\mathrm{op}}}$ be the opposite of the Yoneda embedding for $\mathbf{Y}^{{\mathrm{op}}}$, starting with the representable fuzzy subset $H'(\mathtt{Tarzan}) = \left[\begin{array}{r|cc} \mathtt{Se7en} & 0\\ \mathtt{Tarzan} & 1\\ \end{array}\right]$ of the set $Y$ of movies, we obtain the representable fuzzy concept $\big(M_\star H'(\mathtt{Tarzan}), M^\star M_\star H'(\mathtt{Tarzan})\big) = \left(\left[\begin{array}{r|cc} \mathtt{Cara} & \tfrac{4}{5}\\ \mathtt{Dan} & \tfrac{1}{5}\\ \mathtt{Evie} & \tfrac{4}{5} \end{array}\right], \left[\begin{array}{r|cc} \mathtt{Se7en} & \tfrac{1}{2}\\ \mathtt{Tarzan} & 1 \end{array}\right]\right)$ for $\mathtt{Tarzan}$.

The proximity $\mathcal{V}^{\mathbf{Y}}\big(M^\star M_\star H'(\mathtt{Tarzan}), M^\star H(\mathtt{Cara}) \big) = \inf \big\{ \tfrac{1}{2} \backslash 1, 1 \backslash \tfrac{4}{5} \big\} = \inf \big\{1, \tfrac{4}{5} \big\} = \tfrac{4}{5}$ may be interpreted as the fuzzy truth value of the proposition that the fuzzy set of movies similar to $\mathtt{Tarzan}$ is contained in the fuzzy set of movies which $\mathtt{Cara}$ likes. Notice that it is equal to the proximity $\mathcal{V}^{\mathbf{X}}\big(M_\star M^\star H(\mathtt{Cara}), M_\star H'(\mathtt{Tarzan})\big) = \inf \big\{ 1 \backslash \tfrac{4}{5}, \tfrac{1}{4} \backslash \tfrac{1}{5}, \tfrac{2}{5} \backslash \tfrac{4}{5}\big\} = \inf \big\{ \tfrac{4}{5}, 1, 1\big\} = \tfrac{4}{5}.$ This is because $\operatorname{Fix}(M^\star M_\star )$ and $\operatorname{Fix}(M_\star M^\star )$ are isomorphic proxets.

As explained in Jonathan Elliot’s thesis On the Fuzzy Concept Complex, profunctor nuclei can be visualized as certain convex sets in tropical spaces. We leave the image below as a teaser for the thesis.

Posted at February 8, 2022 4:42 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3380

## 1 Comment & 0 Trackbacks

### Re: Concepts and Profunctor Nuclei

Amazing! This flavor of ACT is one of my favorite.

I think in the sentence immediatly after (1) there’s a ‘typo’, it says ‘limits are pointwise suprema’ and then it proceeds to take infima (which is the correct thing afaiu).

Posted by: Matteo Capucci on February 10, 2022 12:09 PM | Permalink | Reply to this

Post a New Comment