{"id":18,"date":"2025-02-10T11:57:52","date_gmt":"2025-02-10T11:57:52","guid":{"rendered":"https:\/\/elbrinner.com\/?p=18"},"modified":"2025-03-30T12:01:54","modified_gmt":"2025-03-30T12:01:54","slug":"azure-video-indexer","status":"publish","type":"post","link":"https:\/\/elbrinner.com\/index.php\/2025\/02\/10\/azure-video-indexer\/","title":{"rendered":"Azure Video Indexer"},"content":{"rendered":"<p>En este art\u00edculo, exploraremos las capacidades de Azure Video Indexer. Azure Video Indexer es un servicio de Microsoft que permite extraer informaci\u00f3n valiosa de nuestros videos. Es importante destacar que no se trata de un an\u00e1lisis en tiempo real.<\/p>\n<p>Los videos se pueden enviar mediante la API o directamente desde el Portal. Una vez procesado el video, podemos recuperar toda la informaci\u00f3n extra\u00edda y guardarla en una base de datos. Adem\u00e1s de descargar el JSON, es posible obtener recursos visuales como fotogramas clave y fotos de personas detectadas.<\/p>\n<p>Hasta aqu\u00ed, podr\u00eda parecer un servicio simple, pero en realidad es un servicio compuesto por muchos otros que tienen la capacidad de extraer informaci\u00f3n tanto del an\u00e1lisis visual como del audio de los videos. 
To understand this better, let's look at the diagram of how much information can be extracted.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-21\" src=\"https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/model-chart-1024x508.png\" alt=\"Video Indexer\" width=\"1024\" height=\"508\" srcset=\"https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/model-chart-1024x508.png 1024w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/model-chart-300x149.png 300w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/model-chart-768x381.png 768w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/model-chart-1536x762.png 1536w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/model-chart-2048x1016.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p>In the diagram, we can see that the information obtained from audio is separated from the information obtained from video. Let's look at each of them in detail.<\/p>\n<p><strong>From audio we can obtain:<\/strong><\/p>\n<ul>\n<li><strong>Language:<\/strong> It can detect several languages in the same video.<\/li>\n<li><strong>Diarization:<\/strong> It analyzes the audio and assigns a unique identifier to each speaker.<\/li>\n<li><strong>Transcription:<\/strong> It converts the audio into text, presented in two formats: per scene and as individual transcripts. Scene transcripts cover longer segments, including periods of silence. Individual transcripts are shorter and useful for finding specific dialogue. However, scenes with silence will have no transcript, which can leave \u201cgaps\u201d in the information. 
In my tests, I noticed that some videos had no transcribed information until minute 2.<\/li>\n<li><strong>Sentiment analysis:<\/strong> It detects the overall emotional tone of the audio, indicating whether it is positive, negative, or neutral.<\/li>\n<li><strong>Emotion recognition:<\/strong> It identifies specific emotions in the speakers' voices, such as joy, sadness, or anger.<\/li>\n<li><strong>Audio effects:<\/strong> It detects various audio effects present in the content, such as applause, laughter, or silence.<\/li>\n<li><strong>Textual content moderation:<\/strong> It analyzes the transcribed text and detects inappropriate or sensitive language.<\/li>\n<li><strong>Mentioned brands:<\/strong> It recognizes and extracts the names of commercial brands mentioned in the audio or video.<\/li>\n<li><strong>Mentioned people:<\/strong> It identifies and extracts the names of people mentioned in the content.<\/li>\n<li><strong>Mentioned locations:<\/strong> It recognizes and extracts the names of geographic places that are mentioned.<\/li>\n<li><strong>Keyword extraction:<\/strong> It identifies and extracts the most relevant words and phrases from the content.<\/li>\n<li><strong>Topic modeling:<\/strong> It groups the content into relevant topics, which makes search and organization easier.<\/li>\n<\/ul>\n<p><strong>From the image we can extract the following information:<\/strong><\/p>\n<ul>\n<li><strong>OCR (Optical Character Recognition):<\/strong> It extracts text from images within the video, such as signs or subtitles.<\/li>\n<li><strong>Scene segmentation:<\/strong> It divides the video into segments based on significant visual changes.<\/li>\n<li><strong>Keyframe extraction:<\/strong> It automatically identifies the most representative frames of the video.<\/li>\n<li><strong>Shot segmentation:<\/strong> It divides the video into shots, which are sequences of frames taken by the same camera.<\/li>\n<li><strong>Face detection:<\/strong> It locates and marks the faces of the people who appear in the video.<\/li>\n<li><strong>Face grouping:<\/strong> It groups the faces detected in the video, identifying when the same person appears multiple times.<\/li>\n<li><strong>Best face selection:<\/strong> Within a group of faces, it selects the highest-quality image.<\/li>\n<li><strong>Face identification:<\/strong> It recognizes well-known people or people trained in the system.<\/li>\n<li><strong>Observed people:<\/strong> It tracks detected people as they move within the video.<\/li>\n<li><strong>Clothing detection:<\/strong> It detects the garments worn by people in the video.<\/li>\n<li><strong>Object detection:<\/strong> It identifies and labels objects present in the video, such as cars or animals.<\/li>\n<li><strong>Labels:<\/strong> It assigns descriptive labels to key visual elements in the video.<\/li>\n<li><strong>Rolling credits detection:<\/strong> It detects and extracts the text that appears in a video's closing credits.<\/li>\n<li><strong>Visual content moderation:<\/strong> It detects inappropriate or sensitive visual content.<\/li>\n<li><strong>Clapperboard detection:<\/strong> It detects the clapperboard used in video production.<\/li>\n<li><strong>Digital patterns:<\/strong> It detects digital patterns, such as QR codes, within the video.<\/li>\n<li><strong>Textless material:<\/strong> It detects sections of the video that contain no visual text.<\/li>\n<li><strong>Featured clothing:<\/strong> It identifies and highlights specific garments in the video.<\/li>\n<li><strong>Matched person:<\/strong> It allows searching for people who match a reference image.<\/li>\n<\/ul>\n<p>As you can see, this service relies on many others to obtain all this information.<\/p>\n<p>You can try it for free here: <a href=\"https:\/\/www.videoindexer.ai\/\">https:\/\/www.videoindexer.ai\/<\/a><\/p>\n<p>&nbsp;<\/p>\n<p>Facial recognition of people who are not public figures, as in my case, requires special permissions.<\/p>\n<p>In my tests, I configured it to detect me and Wally, as you can see in the screenshot.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-22\" src=\"https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/imagen_wally.png\" alt=\"Azure Video Indexer screenshot\" width=\"850\" height=\"396\" srcset=\"https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/imagen_wally.png 850w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/imagen_wally-300x140.png 300w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/imagen_wally-768x358.png 768w\" sizes=\"auto, (max-width: 850px) 100vw, 850px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>There are more interesting things about this service. We can configure Azure OpenAI to generate a summary of the video. This feature is very useful if, for example, we want to use it for a search engine over meeting videos. 
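Because the analysis is not real time, a client that submits videos through the API usually polls the index until the root-level state field reads "Processed". A minimal JavaScript sketch of that loop, where fetchIndex stands for any caller-supplied function that returns the parsed index JSON (the "Processed" and "Failed" state names are taken from the service's responses; the timing values are arbitrary):

```javascript
// Indexing is asynchronous: after uploading, poll the index endpoint until
// the returned root "state" is "Processed". fetchIndex is any function that
// resolves to the parsed index JSON (e.g. via fetch + res.json()).

function isReady(index) {
  return index.state === "Processed";
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll with a fixed delay; a real client would add backoff and a timeout.
async function waitForIndex(fetchIndex, delayMs = 10000) {
  for (;;) {
    const index = await fetchIndex();
    if (isReady(index)) return index;
    if (index.state === "Failed") {
      throw new Error(index.videos?.[0]?.failureMessage ?? "indexing failed");
    }
    await sleep(delayMs);
  }
}
```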
We can choose a short or long summary, as well as a formal or informal tone.<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-23\" src=\"https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/video_index_captura_3.png\" alt=\"Azure Video Indexer Portal screenshot\" width=\"850\" height=\"368\" srcset=\"https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/video_index_captura_3.png 850w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/video_index_captura_3-300x130.png 300w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/video_index_captura_3-768x332.png 768w\" sizes=\"auto, (max-width: 850px) 100vw, 850px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>All the information returned by the service (emotions, objects, transcripts, etc.) includes the duration and the start and end timestamps. When the information refers to the same object or person, it is presented as an array. For example:<\/p>\n<p>&nbsp;<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"json\">\"faces\": [\r\n            {\r\n                \"videoId\": \"hm899uxybd\",\r\n                \"referenceId\": \"0d47c987-0042-5576-15e8-97af601614fa\",\r\n                \"referenceType\": \"Bing\",\r\n                \"confidence\": 0.8044,\r\n                \"description\": \"William Henry Gates III, conocido como Bill Gates, es un magnate empresarial, desarrollador de software, inversor, autor y fil\u00e1ntropo estadounidense. Es cofundador de Microsoft, junto con su difunto amigo de la infancia Paul Allen. Durante su carrera en Microsoft, Gates ocup\u00f3 los cargos de presidente, director ejecutivo, presidente y arquitecto jefe de software, adem\u00e1s de ser el mayor accionista individual hasta mayo de 2014. 
Fue uno de los principales empresarios de la revoluci\u00f3n de las microcomputadoras de las d\u00e9cadas de 1970 y 1980.\",\r\n                \"title\": \"Empresario estadounidense\",\r\n                \"thumbnailId\": \"b90bd27a-1a25-4a12-968c-15badfa547c4\",\r\n                \"seenDuration\": 70.2,\r\n                \"seenDurationRatio\": 0.061,\r\n                \"id\": 1306,\r\n                \"name\": \"Bill Gates\",\r\n                \"appearances\": [\r\n                    {\r\n                        \"startTime\": \"0:03:07\",\r\n                        \"endTime\": \"0:03:28.9333333\",\r\n                        \"startSeconds\": 187,\r\n                        \"endSeconds\": 208.9\r\n                    },\r\n                    {\r\n                        \"startTime\": \"0:10:09.9333333\",\r\n                        \"endTime\": \"0:10:23.0666667\",\r\n                        \"startSeconds\": 609.9,\r\n                        \"endSeconds\": 623.1\r\n                    }\r\n                ]\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p>When we convert the JSON returned by Azure Video Indexer into C# classes, we get a hierarchical structure in which &#8216;RootResponse&#8217; represents the root node.<\/p>\n<p>&nbsp;<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\">public class RootResponse\r\n    {\r\n        public object partition { get; set; }\r\n        public object description { get; set; }\r\n        public string privacyMode { get; set; }\r\n        public string state { get; set; }\r\n        public string accountId { get; set; }\r\n        public string id { get; set; }\r\n        public string name { get; set; }\r\n        public string userName { get; set; }\r\n        public DateTime created { get; set; }\r\n        public bool isOwned { get; set; }\r\n        public bool isEditable { get; set; }\r\n        public bool isBase { get; set; }\r\n        public int durationInSeconds 
{ get; set; }\r\n        public string duration { get; set; }\r\n        public SummarizedInsights summarizedInsights { get; set; }\r\n        public List&lt;Video&gt; videos { get; set; }\r\n        public List&lt;VideosRange&gt; videosRanges { get; set; }\r\n    }\r\n\r\n    public class Appearance\r\n    {\r\n        public string startTime { get; set; }\r\n        public string endTime { get; set; }\r\n        public double startSeconds { get; set; }\r\n        public double endSeconds { get; set; }\r\n        public double confidence { get; set; }\r\n    }\r\n\r\n    public class AudioEffect\r\n    {\r\n        public string audioEffectKey { get; set; }\r\n        public double seenDurationRatio { get; set; }\r\n        public double seenDuration { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public int id { get; set; }\r\n        public string type { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Block\r\n    {\r\n        public int id { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Brand\r\n    {\r\n        public string referenceId { get; set; }\r\n        public string referenceUrl { get; set; }\r\n        public double confidence { get; set; }\r\n        public string description { get; set; }\r\n        public double seenDuration { get; set; }\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public string referenceType { get; set; }\r\n        public List&lt;object&gt; tags { get; set; }\r\n        public bool isCustom { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class DetectedObject\r\n    {\r\n        public int id { get; set; }\r\n        public string type { get; set; }\r\n        public string thumbnailId { get; set; 
}\r\n        public string displayName { get; set; }\r\n        public string wikiDataId { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Duration\r\n    {\r\n        public string time { get; set; }\r\n        public int seconds { get; set; }\r\n    }\r\n\r\n    public class Emotion\r\n    {\r\n        public string type { get; set; }\r\n        public double seenDurationRatio { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public int id { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Face\r\n    {\r\n        public string videoId { get; set; }\r\n        public string referenceId { get; set; }\r\n        public string referenceType { get; set; }\r\n        public double confidence { get; set; }\r\n        public string description { get; set; }\r\n        public string title { get; set; }\r\n        public string thumbnailId { get; set; }\r\n        public double seenDuration { get; set; }\r\n        public double seenDurationRatio { get; set; }\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public string knownPersonId { get; set; }\r\n        public string imageUrl { get; set; }\r\n        public bool highQuality { get; set; }\r\n        public List&lt;Thumbnail&gt; thumbnails { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class FramePattern\r\n    {\r\n        public string displayName { get; set; }\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public string patternType { get; set; }\r\n        public int confidence { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    
public class Insights\r\n    {\r\n        public string version { get; set; }\r\n        public string duration { get; set; }\r\n        public string sourceLanguage { get; set; }\r\n        public List&lt;string&gt; sourceLanguages { get; set; }\r\n        public string language { get; set; }\r\n        public List&lt;string&gt; languages { get; set; }\r\n        public List&lt;Transcript&gt; transcript { get; set; }\r\n        public List&lt;Ocr&gt; ocr { get; set; }\r\n        public List&lt;Keyword&gt; keywords { get; set; }\r\n        public List&lt;Topic&gt; topics { get; set; }\r\n        public List&lt;Face&gt; faces { get; set; }\r\n        public List&lt;Label&gt; labels { get; set; }\r\n        public List&lt;Scene&gt; scenes { get; set; }\r\n        public List&lt;Shot&gt; shots { get; set; }\r\n        public List&lt;Brand&gt; brands { get; set; }\r\n        public List&lt;NamedLocation&gt; namedLocations { get; set; }\r\n        public List&lt;NamedPerson&gt; namedPeople { get; set; }\r\n        public List&lt;AudioEffect&gt; audioEffects { get; set; }\r\n        public List&lt;DetectedObject&gt; detectedObjects { get; set; }\r\n        public List&lt;Sentiment&gt; sentiments { get; set; }\r\n        public List&lt;Emotion&gt; emotions { get; set; }\r\n        public List&lt;Block&gt; blocks { get; set; }\r\n        public List&lt;FramePattern&gt; framePatterns { get; set; }\r\n        public List&lt;Speaker&gt; speakers { get; set; }\r\n        public TextualContentModeration textualContentModeration { get; set; }\r\n        public Statistics statistics { get; set; }\r\n    }\r\n\r\n    public class Instance\r\n    {\r\n        public string adjustedStart { get; set; }\r\n        public string adjustedEnd { get; set; }\r\n        public string start { get; set; }\r\n        public string end { get; set; }\r\n        public string brandType { get; set; }\r\n        public string instanceSource { get; set; }\r\n        public double confidence { get; 
set; }\r\n        public List&lt;string&gt; thumbnailsIds { get; set; }\r\n        public string thumbnailId { get; set; }\r\n    }\r\n\r\n    public class KeyFrame\r\n    {\r\n        public int id { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Keyword\r\n    {\r\n        public bool isTranscript { get; set; }\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public string text { get; set; }\r\n        public double confidence { get; set; }\r\n        public string language { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Label\r\n    {\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public string referenceId { get; set; }\r\n        public string language { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class NamedLocation\r\n    {\r\n        public string referenceId { get; set; }\r\n        public string referenceUrl { get; set; }\r\n        public double confidence { get; set; }\r\n        public string description { get; set; }\r\n        public double seenDuration { get; set; }\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public List&lt;object&gt; tags { get; set; }\r\n        public bool isCustom { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class NamedPerson\r\n    {\r\n        public string referenceId { get; set; }\r\n        public string referenceUrl { get; set; }\r\n        public double confidence { get; set; }\r\n        public string description { get; set; }\r\n        public 
double seenDuration { get; set; }\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public List&lt;object&gt; tags { get; set; }\r\n        public bool isCustom { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Ocr\r\n    {\r\n        public int id { get; set; }\r\n        public string text { get; set; }\r\n        public double confidence { get; set; }\r\n        public int left { get; set; }\r\n        public int top { get; set; }\r\n        public int width { get; set; }\r\n        public int height { get; set; }\r\n        public int angle { get; set; }\r\n        public string language { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Range\r\n    {\r\n        public string start { get; set; }\r\n        public string end { get; set; }\r\n    }\r\n\r\n\r\n    public class Scene\r\n    {\r\n        public int id { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Sentiment\r\n    {\r\n        public string sentimentKey { get; set; }\r\n        public double seenDurationRatio { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public int id { get; set; }\r\n        public double averageScore { get; set; }\r\n        public string sentimentType { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Shot\r\n    {\r\n        public int id { get; set; }\r\n        public List&lt;string&gt; tags { get; set; }\r\n        public List&lt;KeyFrame&gt; keyFrames { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Speaker\r\n    {\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public 
List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class SpeakerLongestMonolog\r\n    {\r\n        [JsonProperty(\"1\")]\r\n        public int _1 { get; set; }\r\n\r\n        [JsonProperty(\"2\")]\r\n        public int _2 { get; set; }\r\n    }\r\n\r\n    public class SpeakerNumberOfFragments\r\n    {\r\n        [JsonProperty(\"1\")]\r\n        public int _1 { get; set; }\r\n\r\n        [JsonProperty(\"2\")]\r\n        public int _2 { get; set; }\r\n    }\r\n\r\n    public class SpeakerTalkToListenRatio\r\n    {\r\n        [JsonProperty(\"1\")]\r\n        public double _1 { get; set; }\r\n\r\n        [JsonProperty(\"2\")]\r\n        public double _2 { get; set; }\r\n    }\r\n\r\n    public class SpeakerWordCount\r\n    {\r\n        [JsonProperty(\"1\")]\r\n        public int _1 { get; set; }\r\n\r\n        [JsonProperty(\"2\")]\r\n        public int _2 { get; set; }\r\n    }\r\n\r\n    public class Statistics\r\n    {\r\n        public int correspondenceCount { get; set; }\r\n        public SpeakerTalkToListenRatio speakerTalkToListenRatio { get; set; }\r\n        public SpeakerLongestMonolog speakerLongestMonolog { get; set; }\r\n        public SpeakerNumberOfFragments speakerNumberOfFragments { get; set; }\r\n        public SpeakerWordCount speakerWordCount { get; set; }\r\n    }\r\n\r\n    public class SummarizedInsights\r\n    {\r\n        public string name { get; set; }\r\n        public string id { get; set; }\r\n        public string privacyMode { get; set; }\r\n        public Duration duration { get; set; }\r\n        public string thumbnailVideoId { get; set; }\r\n        public string thumbnailId { get; set; }\r\n        public List&lt;Face&gt; faces { get; set; }\r\n        public List&lt;Keyword&gt; keywords { get; set; }\r\n        public List&lt;Sentiment&gt; sentiments { get; set; }\r\n        public List&lt;Emotion&gt; emotions { get; set; }\r\n        public List&lt;AudioEffect&gt; audioEffects { get; set; }\r\n        public 
List&lt;Label&gt; labels { get; set; }\r\n        public List&lt;FramePattern&gt; framePatterns { get; set; }\r\n        public List&lt;Brand&gt; brands { get; set; }\r\n        public List&lt;NamedLocation&gt; namedLocations { get; set; }\r\n        public List&lt;NamedPerson&gt; namedPeople { get; set; }\r\n        public Statistics statistics { get; set; }\r\n        public List&lt;Topic&gt; topics { get; set; }\r\n    }\r\n\r\n    public class TextualContentModeration\r\n    {\r\n        public int id { get; set; }\r\n        public int bannedWordsCount { get; set; }\r\n        public int bannedWordsRatio { get; set; }\r\n        public List&lt;object&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Thumbnail\r\n    {\r\n        public string id { get; set; }\r\n        public string fileName { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Topic\r\n    {\r\n        public string referenceUrl { get; set; }\r\n        public string iptcName { get; set; }\r\n        public string iabName { get; set; }\r\n        public double confidence { get; set; }\r\n        public int id { get; set; }\r\n        public string name { get; set; }\r\n        public List&lt;Appearance&gt; appearances { get; set; }\r\n        public string referenceId { get; set; }\r\n        public string fullName { get; set; }\r\n        public string referenceType { get; set; }\r\n        public string language { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Transcript\r\n    {\r\n        public int id { get; set; }\r\n        public string text { get; set; }\r\n        public double confidence { get; set; }\r\n        public int speakerId { get; set; }\r\n        public string language { get; set; }\r\n        public List&lt;Instance&gt; instances { get; set; }\r\n    }\r\n\r\n    public class Video\r\n    {\r\n        public string accountId { get; set; }\r\n  
      public string id { get; set; }\r\n        public string state { get; set; }\r\n        public string moderationState { get; set; }\r\n        public string reviewState { get; set; }\r\n        public string privacyMode { get; set; }\r\n        public string processingProgress { get; set; }\r\n        public string failureMessage { get; set; }\r\n        public object externalId { get; set; }\r\n        public object externalUrl { get; set; }\r\n        public object metadata { get; set; }\r\n        public Insights insights { get; set; }\r\n        public string thumbnailId { get; set; }\r\n        public int width { get; set; }\r\n        public int height { get; set; }\r\n        public bool detectSourceLanguage { get; set; }\r\n        public string languageAutoDetectMode { get; set; }\r\n        public string sourceLanguage { get; set; }\r\n        public List&lt;string&gt; sourceLanguages { get; set; }\r\n        public string language { get; set; }\r\n        public List&lt;string&gt; languages { get; set; }\r\n        public string indexingPreset { get; set; }\r\n        public string streamingPreset { get; set; }\r\n        public string linguisticModelId { get; set; }\r\n        public string personModelId { get; set; }\r\n        public object logoGroupId { get; set; }\r\n        public bool isAdult { get; set; }\r\n        public List&lt;object&gt; excludedAIs { get; set; }\r\n        public bool isSearchable { get; set; }\r\n        public string publishedUrl { get; set; }\r\n        public object publishedProxyUrl { get; set; }\r\n        public object viewToken { get; set; }\r\n    }\r\n\r\n    public class VideosRange\r\n    {\r\n        public string videoId { get; set; }\r\n        public Range range { get; set; }\r\n    }\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p>In future articles, we will explore how to use the information extracted by Azure Video Indexer to build a robust video search engine. 
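As a first taste of consuming this hierarchy, the following JavaScript sketch flattens the faces section of an insights object into one row per appearance, the kind of record a search index would store. The field names are taken from the faces example shown earlier; the function name is my own.

```javascript
// Flatten the "faces" section of an insights object into one row per
// appearance. Field names (name, confidence, startSeconds, endSeconds)
// match the faces example returned by the service.

function faceAppearanceRows(insights) {
  const rows = [];
  for (const face of insights.faces ?? []) {
    for (const a of face.appearances ?? []) {
      rows.push({
        name: face.name,
        confidence: face.confidence,
        startSeconds: a.startSeconds,
        endSeconds: a.endSeconds,
      });
    }
  }
  return rows;
}
```

Applied to the Bill Gates example above, this yields two rows, one per appearance, each carrying the person's name and the time range in seconds.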
This search engine will rely on Azure AI Search for data storage and retrieval, Ada 002 to generate semantic vectors from the transcripts, and Node.js for the backend.<\/p>\n<p>&nbsp;<\/p>\n<p>As a temporary demonstration of this proof of concept (POC), you can access a working version at the following URL: <a href=\"https:\/\/buscador-para-videos-ia.azurewebsites.net\">https:\/\/buscador-para-videos-ia.azurewebsites.net<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<figure id=\"attachment_24\" aria-describedby=\"caption-attachment-24\" style=\"width: 840px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-24\" src=\"https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/poc_video_indexer.png\" alt=\"\" width=\"850\" height=\"502\" srcset=\"https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/poc_video_indexer.png 850w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/poc_video_indexer-300x177.png 300w, https:\/\/elbrinner.com\/wp-content\/uploads\/2025\/03\/poc_video_indexer-768x454.png 768w\" sizes=\"auto, (max-width: 850px) 100vw, 850px\" \/><figcaption id=\"caption-attachment-24\" class=\"wp-caption-text\">Poc Video Indexer<\/figcaption><\/figure>\n<p>Keep in mind that this POC is provisional and may be taken down at any time.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this article, we will explore the capabilities of Azure Video Indexer. Azure Video Indexer is a Microsoft service that extracts valuable information from our videos. It is important to note that it does not perform real-time analysis. Videos can be submitted through the API or directly from the Portal. 
Once processed &#8230; <a title=\"Azure Video Indexer\" class=\"read-more\" href=\"https:\/\/elbrinner.com\/index.php\/2025\/02\/10\/azure-video-indexer\/\" aria-label=\"Read more about Azure Video Indexer\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[10,20,19],"tags":[21],"class_list":["post-18","post","type-post","status-publish","format-standard","hentry","category-net","category-azure","category-azure-video-indexer","tag-azure-video-indexer"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/posts\/18","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/comments?post=18"}],"version-history":[{"count":1,"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/posts\/18\/revisions"}],"predecessor-version":[{"id":25,"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/posts\/18\/revisions\/25"}],"wp:attachment":[{"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/media?parent=18"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/categories?post=18"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/elbrinner.com\/index.php\/wp-json\/wp\/v2\/tags?post=18"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}