#vision_transformer

Vision transformer

Variant of Transformer designed for vision processing

A vision transformer (ViT) is a transformer designed for computer vision. A ViT breaks down an input image into a series of patches, serialises each patch into a vector, and projects each vector to a lower-dimensional embedding with a single matrix multiplication. These vector embeddings are then processed by a transformer encoder as if they were token embeddings.
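
The patch-and-project step described above can be illustrated with a minimal sketch in plain Python. The function names `patchify` and `project`, the 8x8 RGB toy image, the 4x4 patch size, the embedding size of 6, and the random weights are all assumptions chosen for illustration and do not come from any particular ViT implementation. A real model would learn the projection matrix and use a tensor library. The sketch stops at the embeddings, which are what the transformer encoder would receive.

```python
import random


def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened patch vectors.

    Patches are taken in row-major order; each patch is serialised into a
    flat list of length patch_size * patch_size * C.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            vector = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    vector.extend(image[row][col])  # append all channel values
            patches.append(vector)
    return patches


def project(patches, weights):
    """Multiply each patch vector by a (patch_dim x embed_dim) matrix."""
    embed_dim = len(weights[0])
    return [
        [sum(x * weights[i][j] for i, x in enumerate(patch)) for j in range(embed_dim)]
        for patch in patches
    ]


# Toy input (illustrative values): 8x8 RGB image, 4x4 patches, embedding size 6.
rng = random.Random(0)
image = [[[rng.random() for _ in range(3)] for _ in range(8)] for _ in range(8)]
patches = patchify(image, patch_size=4)

# Hypothetical projection matrix; in a trained ViT these weights are learned.
weights = [[rng.gauss(0.0, 0.02) for _ in range(6)] for _ in range(len(patches[0]))]
embeddings = project(patches, weights)

print(len(patches), len(patches[0]))        # 4 48
print(len(embeddings), len(embeddings[0]))  # 4 6
```

Here each 4x4x3 patch becomes a 48-element vector, and the single matrix multiplication reduces it to 6 values, giving a sequence of four embeddings that plays the role of a token sequence.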

Tue 17th

Provided by Wikipedia
