Models
Background Replacement lets users change the background of their images for creative transformations and visual enhancements.
by @AIOZNetwork
The Depth Estimation task involves determining the distance or depth information of objects in a given scene or image. It is a computer vision task that utilizes various techniques and algorithms to estimate the relative distances between objects and their positions in three-dimensional space.
by @AIOZNetwork
Jak's Woolitize Image Generator
Jak's Woolitize Image Generator is a text-to-image task that applies a woolitize texture and appearance to generated images, producing images that convey warmth.
by @AIOZNetwork
Story Generator
This task aims to create stories or paragraphs automatically based on pre-programmed patterns and rules.
by @AIOZNetwork
Anime Style Image Generator
This task generates anime-style images from text descriptions or prompts.
by @AIOZNetwork
Speech Recognition by Fairseq S2T
S2T is an end-to-end sequence-to-sequence transformer model. It is trained with standard autoregressive cross-entropy loss and generates the transcripts autoregressively.
by @AIOZNetwork
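To make the S2T description above concrete, here is a minimal toy sketch of what "autoregressive cross-entropy loss" and "generates the transcripts autoregressively" mean. The vocabulary and the `toy_next_token_probs` function are hypothetical stand-ins invented for illustration; they are not part of Fairseq, and a real S2T decoder would also condition on the encoded audio.

```python
import math

VOCAB = ["<eos>", "hello", "world"]

def toy_next_token_probs(prefix):
    """Hypothetical stand-in for a decoder: returns P(next token | prefix).

    A real speech-to-text model would also condition on audio features.
    """
    if len(prefix) == 0:
        return [0.1, 0.8, 0.1]   # most likely: "hello"
    if prefix[-1] == 1:
        return [0.2, 0.1, 0.7]   # after "hello", most likely: "world"
    return [0.9, 0.05, 0.05]     # otherwise, most likely: "<eos>"

def cross_entropy(target_ids):
    """Average negative log-likelihood of the target, one token at a time.

    Each step is scored given the true previous tokens (teacher forcing).
    """
    total = 0.0
    for t, token_id in enumerate(target_ids):
        probs = toy_next_token_probs(target_ids[:t])
        total += -math.log(probs[token_id])
    return total / len(target_ids)

def greedy_decode(max_len=10):
    """Generate one token at a time, feeding each prediction back in."""
    prefix = []
    for _ in range(max_len):
        probs = toy_next_token_probs(prefix)
        next_id = max(range(len(probs)), key=probs.__getitem__)
        if next_id == 0:  # stop at <eos>
            break
        prefix.append(next_id)
    return [VOCAB[i] for i in prefix]
```

Calling `greedy_decode()` walks the toy distribution token by token, and `cross_entropy([1, 2, 0])` scores the sequence "hello world <eos>" under the same distribution.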
TAPEX: Table Pre-training via Learning a Neural SQL Executor
TAPEX (Table Pre-training via Execution) is a conceptually simple and empirically powerful pre-training approach to empower existing models with table reasoning skills. TAPEX realizes table pre-training by learning a neural SQL executor over a synthetic corpus, which is obtained by automatically synthesizing executable SQL queries.
by @AIOZNetwork
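The data-synthesis idea in the TAPEX description above can be sketched roughly as follows: fill SQL templates, execute them against a table, and keep the (query, answer) pairs that a model could then be trained to reproduce. This is only an illustration using Python's built-in `sqlite3` as the executor; the table, the templates, and the `synthesize_examples` name are assumptions made for this sketch, not the actual TAPEX pipeline.

```python
import sqlite3

# Illustrative table (an assumption for this sketch).
ROWS = [("Oslo", 1000), ("Bergen", 500), ("Tromso", 200)]

# Hypothetical query templates; "{value}" is a slot for a constant.
TEMPLATES = [
    "SELECT city FROM cities WHERE population > {value}",
    "SELECT COUNT(*) FROM cities WHERE population < {value}",
    "SELECT MAX(population) FROM cities",
]

def synthesize_examples(values=(300, 800)):
    """Fill each template, execute it, and keep (query, answer) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cities (city TEXT, population INTEGER)")
    conn.executemany("INSERT INTO cities VALUES (?, ?)", ROWS)
    examples = []
    for template in TEMPLATES:
        # Templates without a slot yield a single query.
        fills = values if "{value}" in template else (None,)
        for value in fills:
            query = template.format(value=value)
            # The executed result is the supervision target for this query.
            answer = [row[0] for row in conn.execute(query)]
            examples.append((query, answer))
    conn.close()
    return examples
```

Each returned pair couples an executable query with its result on the table, which is the kind of supervision a neural SQL executor would learn from.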
TAPAS: Weakly Supervised Table Parsing via Pre-training
TAPAS is a BERT-like Transformer model pretrained in a self-supervised fashion on a large corpus of English data from Wikipedia. It was pretrained on raw tables and their associated text only, with no human labelling (which is why it can use large amounts of publicly available data); an automatic process generates the inputs and labels from those texts.
by @AIOZNetwork
Image Restoration by Deraindrop
CMFNet achieves competitive performance on three tasks: image deblurring, image dehazing and image deraindrop.
by @AIOZNetwork
Image Restoration by Deblur
CMFNet achieves competitive performance on three tasks: image deblurring, image dehazing and image deraindrop.
by @AIOZNetwork
Longformer for SQuADv2
Longformer uses a combination of a sliding window (local) attention and global attention. Global attention is user-configured based on the task to allow the model to learn task-specific representations.
by @AIOZNetwork
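The attention pattern in the Longformer description above can be illustrated with a small mask-building sketch. It is a simplified picture, not the library implementation: the function name, the `window // 2` neighbors-per-side convention, and the dense boolean matrix are assumptions chosen for readability.

```python
def longformer_attention_mask(seq_len, window, global_positions):
    """Return a seq_len x seq_len boolean matrix; mask[i][j] is True when
    token i may attend to token j.

    window: total local window size; each token sees window // 2 neighbors
            on each side (a convention assumed by this sketch).
    global_positions: indices the user selects for global attention.
    """
    half = window // 2
    global_set = set(global_positions)
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            local = abs(i - j) <= half
            # Global tokens attend everywhere and are attended to by everyone.
            is_global = i in global_set or j in global_set
            mask[i][j] = local or is_global
    return mask
```

For example, `longformer_attention_mask(6, 2, [0])` lets token 3 see its neighbors 2 and 4, itself, and the global token 0, but not tokens 1 or 5. For question answering, a common choice is to mark the question tokens as global, though which positions to pick depends on the task.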
Image to Text by Pix2Struct
Pix2Struct is an image-encoder, text-decoder model trained on image-text pairs for various tasks, including image captioning and visual question answering.
by @AIOZNetwork
Document Visual Question Answering
Donut model fine-tuned on DocVQA. It was introduced in the paper OCR-free Document Understanding Transformer by Geewook Kim et al. Donut consists of a vision encoder (Swin Transformer) and a text decoder (BART). Given an image, the encoder first encodes the image into a tensor of embeddings of shape (batch_size, seq_len, hidden_size), after which the decoder autoregressively generates text, conditioned on the encoder's output.
by @AIOZNetwork
Image Captioning using ViT and GPT2
Image captioning is the process of generating a meaningful textual description of an image. It combines computer vision and natural language processing to analyze the visual content of an image and generate a coherent caption that accurately represents it.
by @AIOZNetwork
Filling Mask with BERT
Mask filling is the task of predicting words that have been masked out of a text. Given a sentence containing a mask token, a masked language model such as BERT uses the surrounding context to propose the most likely words for the masked position.
by @AIOZNetwork
Image-Guided Object Detection with OWL-ViT
You can use OWL-ViT to query images with text descriptions of any object or, alternatively, with an example (query) image of the target object. To use it, upload an image and a query image that contains only the object you're looking for. You can also use the score and non-maximum suppression threshold sliders to filter out low-probability and overlapping bounding box predictions.
by @AIOZNetwork
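The two thresholds mentioned in the OWL-ViT entry above can be pictured with a minimal sketch of score filtering followed by greedy non-maximum suppression. This is a generic illustration of the technique, not the code behind the demo; the function names, default thresholds, and `(x1, y1, x2, y2)` box format are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def filter_detections(boxes, scores, score_threshold=0.5, nms_threshold=0.3):
    """Drop low-score boxes, then greedily suppress overlapping ones.

    Returns the indices of the boxes that survive, highest score first.
    """
    candidates = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in kept):
            kept.append(i)
    return kept
```

Raising `score_threshold` removes more low-probability predictions, while lowering `nms_threshold` removes more boxes that overlap a higher-scoring one.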
Sentiment classification is the automated process of identifying emotions in text and classifying them as positive, negative, or neutral based on the opinions expressed.
by @AIOZNetwork
Zero-shot text classification is a technique used in natural language processing (NLP) to classify text into predefined categories without requiring any labeled training data for those specific categories.
by @AIOZNetwork
The Vehicle Classification task involves automatically categorizing vehicles based on their visual characteristics, such as shape, size, and appearance. It combines computer vision techniques and machine learning algorithms to analyze images containing vehicles and assign them to specific classes or categories.
by @AIOZNetwork
Musical instrument classification is the task of automatically recognizing and categorizing different musical instruments from audio recordings or spectrograms. It involves identifying the unique characteristics and sound patterns associated with each instrument to determine its class or type.
by @AIOZNetwork